March - AI Extension 2023
April 5, 2023, 17:23
For the first time, the Inkscape project has hired an external contractor to develop a feature for Inkscape. This is a huge step for us as a project, and your donations made that happen! The feature selected was import of the Illustrator file format, and it is being developed as a Python extension so that other free software projects can use it too.
This post outlines the progress in the first month of the project, March 2023. Work is carried out inside the git repository extension-ai.
Inkscape already seems to support .ai files. What do we need a new importer for?
Modern (after 2000) AI files are actually PDF files - so they can be displayed with any PDF viewer. And when Inkscape opens an .ai file, it just calls the PDF importer. PDF is a lossy format - a lot of the information that is contained in any file exported to PDF is lost, just like taking a photo of a three-dimensional scene.
The information that can't be represented in the PDF - like text layout metadata, path effects, filters, mesh gradient arrangements, layers and much more - is embedded as a compressed datastream inside the PDF file, and this datastream is the actual AI file. After completion of this project, we want to be able to discard the PDF data and have a high-fidelity import of AI files based solely on this embedded datastream - and this is the work that we have contracted out.
As a starting point, the Inkscape team had prepared the extraction of the plain text behind the compressed datastream. The compression algorithm was changed a few times, and the extraction had to be adapted accordingly.
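One way to cope with a compression scheme that changed over time is to try several decompressors in turn and keep the first one that succeeds. The following is a minimal sketch of that idea, not the code from the extension-ai repository: the post does not say which algorithms AI files actually use, so the standard-library codecs below are stand-ins, and the names `CANDIDATES` and `decompress_any` are hypothetical.

```python
import bz2
import lzma
import zlib
from typing import Callable, Dict, Optional, Tuple

# Candidate decompressors, tried in insertion order.
# ASSUMPTION: these stdlib codecs are placeholders; the algorithms really
# used inside AI files are not named in this post.
CANDIDATES: Dict[str, Callable[[bytes], bytes]] = {
    "zlib": zlib.decompress,
    "bz2": bz2.decompress,
    "lzma": lzma.decompress,
    # Raw deflate (no zlib header) goes last because it has no magic bytes.
    "raw-deflate": lambda data: zlib.decompress(data, -zlib.MAX_WBITS),
}


def decompress_any(blob: bytes) -> Optional[Tuple[str, bytes]]:
    """Return (codec name, decompressed bytes) for the first codec that works.

    Returns None when no candidate can decode the blob.
    """
    for name, decompress in CANDIDATES.items():
        try:
            return name, decompress(blob)
        except (zlib.error, lzma.LZMAError, OSError, ValueError, EOFError):
            # This codec does not match the data; try the next one.
            continue
    return None


if __name__ == "__main__":
    sample = zlib.compress(b"%AI5_FileFormat example payload")
    print(decompress_any(sample))
```

Keeping the candidates in one table means that supporting a further variant only requires adding an entry, without touching the calling code.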
Nicco Kunzmann
Nicco Kunzmann's work is done under a contract with the Software Freedom Conservancy (SFC).
- Added pytest fixtures and tox tests for
  - REUSE (licensing compliance)
  - the parser, SVG conversion and command-line use on Python versions 3.7+
- Implemented the most important sections of the grammar according to the AI Spec in pyparsing
- Converted paths, layers and groups to SVG
- Created a parser for the hierarchical data structure that is not covered by the specification, used e.g. for document metadata and SVG filters
- Created an architecture that separates SVG construction from pyparsing grammar construction (cleaner code)
- Refactored the pyparsing parser for speed
- Built tools to reverse-engineer the grammar:
  - Extraction of elements inside the files (text, whole document)
  - Parsing sub-elements of the grammar at specific places
  - Test cases to match elements
- Benchmarked pyparsing to make decisions about a fast implementation
- Benchmarked implementations of the hierarchical data structure parser in pyparsing and in plain Python to decide how to approach further parser implementations
- Documented deviations from and additions to the 1998 AI Specification
- Created test cases that run on every .ai file to check the progress of the conversion to SVG
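To give a feeling for what converting paths to SVG involves, here is a minimal sketch in plain Python rather than pyparsing. It is not the project's code. It assumes a PostScript-style postfix notation in which operands precede their operator; the operator names in `OPERATORS` and `CLOSING` are assumptions for illustration, and `path_to_svg_d` is a hypothetical helper.

```python
from typing import Dict, List, Tuple

# ASSUMPTION: operator names and operand counts are illustrative only.
# Maps a path operator to (number of operands, SVG path command).
OPERATORS: Dict[str, Tuple[int, str]] = {
    "m": (2, "M"),  # move to
    "l": (2, "L"),  # line to
    "L": (2, "L"),
    "c": (6, "C"),  # cubic Bezier curve to
    "C": (6, "C"),
}
# ASSUMPTION: painting operators that also close the current path.
CLOSING = {"s", "f", "b", "n"}


def path_to_svg_d(source: str) -> str:
    """Translate postfix path data into the `d` attribute of an SVG path."""
    operands: List[str] = []
    parts: List[str] = []
    for token in source.split():
        if token in OPERATORS:
            count, command = OPERATORS[token]
            if len(operands) != count:
                raise ValueError(f"{token!r} expects {count} operands")
            parts.append(command + " " + " ".join(operands))
            operands = []
        elif token in CLOSING:
            if operands:
                raise ValueError(f"unexpected operands before {token!r}")
            parts.append("Z")
        else:
            float(token)  # raises ValueError if the token is not a number
            operands.append(token)
    if operands:
        raise ValueError("trailing operands without an operator")
    return " ".join(parts)


if __name__ == "__main__":
    print(path_to_svg_d("10 20 m 30 40 L 50 60 70 80 90 100 C f"))
```

The sketch copies coordinates unchanged. A real converter would likely also have to map between the coordinate systems of the two formats and carry over fill and stroke attributes, which is part of why separating grammar construction from SVG construction pays off.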
To summarize: after the first month, we are able to open an AI file containing a simple path, such as Inkscape's logo (see the screen cast). We have learned a lot about parsing performance and have a robust unit testing framework, which should be able to accommodate all the features that follow.