Inkscape.org
Beyond the Basics Trim whitespace from svg command-line
  1. #1
    pozitron57 pozitron57 @pozitron57

    Hello. I often use Inkscape to trim the whitespace from my SVG figures created with Matplotlib (a Python library). I haven't found a way to generate these figures entirely without a whitespace border.

    In Inkscape, I select the plot, ungroup it twice, delete two empty objects near the border, and then use 'Resize Page to Selection' to remove the remaining whitespace.

    Here are examples of the original and trimmed figures: Trimmed SVG and Original SVG.

    Is there a way to automate this process, possibly using command-line Inkscape commands? Alternatively, if there's another method to completely remove the whitespace from SVG files without Inkscape, I'd appreciate learning about it.

  2. #2
    inklinea inklinea @inklinea⛰️

    Easy way:

    It looks like there is a group called patch_1 which contains a path.

    If the naming is always the same, this should work:

    inkscape --actions="select-by-id:patch_1;delete;select-all:all;fit-canvas-to-selection;export-filename:output.svg;export-do;" 67t_original.svg

    (On Windows you may need to use inkscapecom.com in 1.3.)

    Or, the universal way: find the object with the largest bounding box, and use its id.

    That would require two command lines.

    First, process the output of --query-all with a script to find the largest basic object (not a group) by bounding box.

    Then run the first example with that id.
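
The script step of the universal way could be sketched in Python roughly as follows. It assumes the output of `inkscape --query-all file.svg` has already been captured as text, with one `id,x,y,width,height` line per object. The function name `largest_bbox_id`, the `exclude` parameter and the sample ids and numbers are illustrative assumptions; the output alone does not say which ids are groups, so the caller has to supply the ids to ignore.

```python
def largest_bbox_id(query_all_output, exclude=()):
    """Return the id with the largest bounding-box area, or None if there is none.

    query_all_output: text with one "id,x,y,width,height" line per object.
    exclude: ids to ignore (for example the root element and groups).
    """
    best_id, best_area = None, 0.0
    for line in query_all_output.splitlines():
        parts = line.strip().split(',')
        if len(parts) != 5 or parts[0] in exclude:
            continue  # skip blank or malformed lines and excluded ids
        try:
            width, height = float(parts[3]), float(parts[4])
        except ValueError:
            continue  # skip lines whose size fields are not numbers
        area = width * height
        if area > best_area:
            best_id, best_area = parts[0], area
    return best_id


# Made-up sample in the same shape as --query-all output.
sample = """svg1,0,0,460.8,345.6
figure_1,0,0,460.8,345.6
patch_1,0,0,460.8,345.6
axes_1,57.6,41.5,357.1,266.1
line2d_1,73.8,53.6,324.6,241.9
"""

print(largest_bbox_id(sample, exclude={'svg1', 'figure_1', 'axes_1'}))
```

The returned id can then take the place of patch_1 in the select-by-id action of the first example.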

  3. #3
    pozitron57 pozitron57 @pozitron57

    Thank you, it worked!

    Most of my plots do indeed use the same group name.

    I found out that I need to remove a group called patch_2 as well. I'll leave my Python code here in case someone googles this, since I mentioned Matplotlib:

     

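    import os

    full_path = '/home/user/dir/file.svg'

    # Delete the two background groups, fit the page to the remaining objects,
    # and export over the original file. The path is quoted in case it contains spaces.
    inkscape_command = f'inkscape --actions="select-by-id:patch_1,patch_2;delete;select-all:all;fit-canvas-to-selection;export-filename:{full_path};export-do;" "{full_path}"'

    os.system(inkscape_command)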

     

    Could you please give more details on the second, more universal way, where there is no need to know the group id?
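
A variation on the script above, sketched under the assumption that `inkscape` is on the PATH: building the command as an argument list and running it with `subprocess.run` avoids shell quoting issues when the file path contains spaces. The helper name `build_trim_command` and the example path are hypothetical; the actions string is the one from the post.

```python
import os
import shutil
import subprocess


def build_trim_command(svg_path, ids=('patch_1', 'patch_2')):
    """Build the Inkscape call that deletes the given ids and fits the page."""
    actions = (
        f"select-by-id:{','.join(ids)};delete;"
        "select-all:all;fit-canvas-to-selection;"
        f"export-filename:{svg_path};export-do;"
    )
    # One list item per argument, so no shell quoting is needed.
    return ['inkscape', f'--actions={actions}', svg_path]


command = build_trim_command('/home/user/dir/my file.svg')
print(command)

# Only run the command when Inkscape is installed and the file exists.
if shutil.which('inkscape') is not None and os.path.isfile(command[-1]):
    subprocess.run(command, check=True)
```

Like the original script, this overwrites the input file, so it is worth trying on a copy first.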

  4. #4
    inklinea inklinea @inklinea⛰️

    If that is a Python script, does it have access to

    import inkex

    or not? It's a lot simpler if it does.

    -----------

    Also, if you regularly process graphs like that, you might want to look at:

    https://inkscape.org/~burghoff/%E2%98%85scientific-inkscape

  5. #5
    pozitron57 pozitron57 @pozitron57

    Yes, sure, it is my script and I can import inkex.

    Thanks for the link, interesting. But I only need to remove the whitespace from the svg figure.

  6. #6
    inklinea inklinea @inklinea⛰️

    In this case, if you know how to use Python, it's probably simpler to just loop through a restricted set of elements, check each bounding box size, and find the largest.

    Delete that element, save the result as a temp file, then do the command call.

    This assumes that you have already run source YOUR_INKEX_VENV/bin/activate

    Give this a try; it should loop through all SVG files in the folder.

    Usage is

    remove_graph_frame.py input_folder output_folder

    import inkex
    
    import os, sys
    import shutil
    import tempfile
    
    
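    def get_files_in_folder(folder, extension='.svg'):
        # Return the names of the files in the folder that end with the given extension.
        # An empty list (rather than None) is returned when nothing matches,
        # so the caller can always loop over the result.
        filepath_list = [x for x in os.listdir(folder) if x.endswith(extension)]
        if not filepath_list:
            inkex.errormsg('No Files Found')
        return filepath_list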
    
    
    def process_svg(svg_file, input_folder, output_folder):
        input_filepath = os.path.join(input_folder, svg_file)
        try:
            svg_element = inkex.load_svg(input_filepath).getroot()
            largest_bbox_element_id, largest_bbox_element = get_largest_geometric_bbox(svg_element)
            if largest_bbox_element_id:
                # Delete the element with the largest bounding box (assumed to be the background frame).
                print(largest_bbox_element_id)
                largest_bbox_element.delete()

        except Exception as error:
            # This also catches failures while computing bounding boxes, not only load errors.
            print(f'cannot process svg {input_filepath}: {error}')
            return

        crop_and_save_processed_svg(svg_element, svg_file, output_folder)
    
    def crop_and_save_processed_svg(svg_element, svg_file, output_folder):
    
        temp_svg_filepath = os.path.join(temp_folder, svg_file)
    
        with open(temp_svg_filepath, 'w') as output_file:
            output_file.write(svg_element.tostring().decode('utf-8'))
    
        output_filepath = os.path.join(output_folder, svg_file)
    
        my_actions = 'select-all;fit-canvas-to-selection;'
    
        export_actions = my_actions + f'export-type:svg;export-filename:{output_filepath};export-do;'
    
        print(export_actions)
    
        print(f'temp svg {temp_svg_filepath}')
    
        inkex.command.inkscape(temp_svg_filepath, actions=export_actions)
    
    
    def get_largest_geometric_bbox(svg_element):
        # Track the basic shape element whose bounding box has the largest area.
        element_area = 0
        largest_bbox_element_id = None
        largest_bbox_element = None

        element_list = svg_element.xpath('//svg:path | //svg:polygon | //svg:polyline | //svg:rect | //svg:use | //svg:image')
        for element in element_list:
            if not hasattr(element, 'bounding_box'):
                continue
            # Compute the bounding box once and reuse it for both dimensions.
            bbox = element.bounding_box()
            bbox_area = float(bbox.width) * float(bbox.height)
            if bbox_area > element_area:
                element_area = bbox_area
                largest_bbox_element_id = element.get_id()
                largest_bbox_element = element

        return largest_bbox_element_id, largest_bbox_element
    
    
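    # Check the arguments before reading them, so a missing argument prints the usage text.
    if len(sys.argv) != 3 or not os.path.isdir(sys.argv[1]):
        print('Folder Not Found')
        print('Usage: remove_graph_frame.py input_folder output_folder')
        sys.exit(1)

    input_folder = sys.argv[1]
    output_folder = sys.argv[2]
    # Make sure the output folder exists before Inkscape exports into it.
    os.makedirs(output_folder, exist_ok=True)
    # Temporary folder for the intermediate SVG files (used as a global by crop_and_save_processed_svg).
    temp_folder = tempfile.mkdtemp()

    filepath_list = get_files_in_folder(input_folder, extension='.svg') or []

    for svg_file in filepath_list:
        print(f'Current SVG {svg_file}')
        process_svg(svg_file, input_folder, output_folder)

    shutil.rmtree(temp_folder)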
    
    
    

     

     

  7. #7
    pozitron57 pozitron57 @pozitron57

    Thank you very much for the code! I modified two functions, because your code sometimes deletes elements from the figure. Here is the modified version:

    def crop_and_save_processed_svg(svg_element, largest_bbox_element, svg_file, output_folder):
        temp_svg_filepath = os.path.join(temp_folder, svg_file)
    
        # Instead of deleting the largest element, resize the SVG canvas to match
        # the bounding box of the largest element. The intent is that only whitespace
        # around the figure is removed, without deleting any significant part of the image.
        # Compare with None explicitly: lxml elements without children evaluate as False.
        if largest_bbox_element is not None and hasattr(largest_bbox_element, 'bounding_box'):
            bbox = largest_bbox_element.bounding_box()
            svg_element.set('width', str(bbox.width))
            svg_element.set('height', str(bbox.height))
            svg_element.set('viewBox', f'{bbox.left} {bbox.top} {bbox.width} {bbox.height}')
    
        with open(temp_svg_filepath, 'w') as output_file:
            output_file.write(svg_element.tostring().decode('utf-8'))
    
        output_filepath = os.path.join(output_folder, svg_file)
        my_actions = 'select-all;fit-canvas-to-selection;'
        export_actions = my_actions + f'export-type:svg;export-filename:{output_filepath};export-do;'
        print(export_actions)
        print(f'temp svg {temp_svg_filepath}')
        inkex.command.inkscape(temp_svg_filepath, actions=export_actions)
    
    
    def process_svg(svg_file, input_folder, output_folder):
        try:
            svg_element = inkex.load_svg(os.path.join(input_folder, svg_file)).getroot()
            _, largest_bbox_element = get_largest_geometric_bbox(svg_element)
    
        except:
            print(f'cannot load svg {os.path.join(input_folder, svg_file)}')
            return
    
        # Passing the largest_bbox_element to the crop_and_save_processed_svg function
        # to adjust the canvas size according to this element's bounding box.
        crop_and_save_processed_svg(svg_element, largest_bbox_element, svg_file, output_folder)

     

    I will do more testing and close the question if everything is working correctly.
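
The canvas-resizing idea in this post (setting `width`, `height` and `viewBox` from a bounding box) can also be illustrated without inkex. The following is a minimal sketch using only the standard library; the function name `set_canvas_to_bbox`, the sample SVG and the bounding-box numbers are made up for illustration, and the new `width` and `height` are written as bare user units.

```python
import xml.etree.ElementTree as ET

SVG_NS = 'http://www.w3.org/2000/svg'
ET.register_namespace('', SVG_NS)  # keep the default namespace on output


def set_canvas_to_bbox(root, left, top, width, height):
    """Crop the SVG canvas to the given rectangle (in user units)."""
    root.set('width', str(width))
    root.set('height', str(height))
    root.set('viewBox', f'{left} {top} {width} {height}')
    return root


source = (
    f'<svg xmlns="{SVG_NS}" width="460.8pt" height="345.6pt" '
    'viewBox="0 0 460.8 345.6">'
    '<rect x="57.6" y="41.5" width="357.1" height="266.1"/></svg>'
)
root = ET.fromstring(source)
set_canvas_to_bbox(root, 57.6, 41.5, 357.1, 266.1)
print(ET.tostring(root, encoding='unicode'))
```

Matplotlib writes `width` and `height` in pt, so replacing them with bare numbers changes the physical size of the figure unless the unit is added back.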

  8. #8
    inklinea inklinea @inklinea⛰️

    Yes, I did write the code to delete the element with the largest bounding box.

    I assumed that many graphs contain a non-graph element that acts as a background, which is often the case.

    I didn't want to put too much info in the reply post.

    Another way to get the bounding boxes (the visual ones; this is also the way to get text bounding boxes) is, as mentioned, to parse the output of --query-all.

    This does require two command calls instead of one, although it might end up being just as fast in this case.

    There is a function def inkscape_command_call_bboxes_to_dict(self, input_file)

    in this extension file I wrote: https://gitlab.com/inklinea/simple-frame/-/blob/main/inklinea.py?ref_type=heads

    There is also a hideous example of parsing this information in bash here:

    https://inkscape.org/forums/beyond/export-svg-without-objetct-outside-the-page-using-cli/

  9. #9
    pozitron57 pozitron57 @pozitron57

    Thank you for the replies!

    I found files that can't be processed with this script.

    Here is an example file: 335t_original.svg, and here is the modified version of your code that I use (sorry for the repetition):

     

    # This function processes an SVG file
    def process_svg(svg_file):
        try:
            # Load the SVG file and get the root element
            svg_element = inkex.load_svg(svg_file).getroot()
            # Get the largest bounding box element
            _, largest_bbox_element = get_largest_geometric_bbox(svg_element)
        except:
            # If the SVG file cannot be loaded, print an error message
            print(f'Cannot load svg {svg_file}')
            return
        # Crop the SVG and save the processed file
        crop_and_save_processed_svg(svg_element, largest_bbox_element, svg_file)
    
    # This function crops the SVG and saves it
    def crop_and_save_processed_svg(svg_element, largest_bbox_element, svg_file):
        # Change the SVG canvas size according to the bounding box.
        # Compare with None explicitly: lxml elements without children evaluate as False.
        if largest_bbox_element is not None and hasattr(largest_bbox_element, 'bounding_box'):
            bbox = largest_bbox_element.bounding_box()
            svg_element.set('width', str(bbox.width))
            svg_element.set('height', str(bbox.height))
            svg_element.set('viewBox', f'{bbox.left} {bbox.top} {bbox.width} {bbox.height}')
        # Write the changes to the same SVG file
        with open(svg_file, 'w') as output_file:
            output_file.write(svg_element.tostring().decode('utf-8'))
        # Define the actions for Inkscape command line
        my_actions = 'select-all;fit-canvas-to-selection;'
        export_actions = my_actions + f'export-type:svg;export-filename:{svg_file};export-do;'
        # Execute the Inkscape actions
        inkex.command.inkscape(svg_file, actions=export_actions)
    
    # This function returns the largest geometric bounding box
    def get_largest_geometric_bbox(svg_element):
        element_area = 0
        largest_bbox_element_id = None
        largest_bbox_element = None
        # Create a list of potential elements to check
        element_list = svg_element.xpath('//svg:path | //svg:polygon | //svg:polyline | //svg:rect | //svg:use | //svg:image')
        # Determine the largest bounding box
        for element in element_list:
            if hasattr(element, 'bounding_box'):
                bbox_area = float(element.bounding_box().width) * float(element.bounding_box().height)
                if bbox_area > element_area:
                    element_area = bbox_area
                    largest_bbox_element_id = element.get_id()
                    largest_bbox_element = element
            else:
                continue
        return largest_bbox_element_id, largest_bbox_element
    
    # Example usage
    svg_file_name = 'file.svg'
    if not os.path.exists(svg_file_name):
        print(f'File Not Found: {svg_file_name}')
    else:
        process_svg(svg_file_name)


     

    I get the message "Cannot load svg <filename>". It seems that `element.bounding_box()` returns None. Could you please have a look at it and tell me what is wrong?

  10. #10
    inklinea inklinea @inklinea⛰️

    It's because there is an incomplete path element (it has no d attribute) in the SVG file:

    <path id="STIXTwoText-20" transform="scale(0.015625)"/>

    I would add a try/except statement inside the loop to skip such elements:

    try:
        if hasattr(element, 'bounding_box'):
            # Compute the bounding box once; this can fail for an incomplete path.
            bbox = element.bounding_box()
            bbox_area = float(bbox.width) * float(bbox.height)
            if bbox_area > element_area:
                element_area = bbox_area
                largest_bbox_element_id = element.get_id()
                largest_bbox_element = element
        else:
            continue
    except Exception:
        # Skip elements whose bounding box cannot be computed.
        continue
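
An alternative to wrapping the whole block in try/except is to check the bounding box explicitly before using it. The sketch below uses stand-in classes (`FakeBBox`, `FakeElement`) instead of real inkex elements so that it runs on its own; it assumes, as suggested earlier in the thread, that `bounding_box()` returns None for an incomplete path. The helper name `bbox_area` is also hypothetical.

```python
class FakeBBox:
    """Stand-in for a bounding box with width and height."""
    def __init__(self, width, height):
        self.width, self.height = width, height


class FakeElement:
    """Stand-in for an SVG element; bbox=None mimics an incomplete path."""
    def __init__(self, element_id, bbox=None):
        self.element_id, self.bbox = element_id, bbox

    def bounding_box(self):
        return self.bbox

    def get_id(self):
        return self.element_id


def bbox_area(element):
    """Return the bounding-box area, or 0.0 when no box is available."""
    bbox = element.bounding_box() if hasattr(element, 'bounding_box') else None
    if bbox is None:
        return 0.0
    return float(bbox.width) * float(bbox.height)


elements = [
    FakeElement('STIXTwoText-20'),                   # no geometry
    FakeElement('patch_1', FakeBBox(460.8, 345.6)),
    FakeElement('line2d_1', FakeBBox(324.6, 241.9)),
]
largest = max(elements, key=bbox_area)
print(largest.get_id())
```

The explicit check skips only the missing-box case, so other unexpected errors would still surface instead of being silently ignored.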

     

  11. #11
    pozitron57 pozitron57 @pozitron57

    Thanks, the error is gone, however the whitespace is not trimmed. 

  12. #12
    inklinea inklinea @inklinea⛰️

    I ran it through the original code I wrote, and it did trim.

    I think you might need to write some extra code to decide when to remove the largest object, and when not to.
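
One possible form for that extra decision, offered as a sketch rather than a tested rule: treat the largest object as a background frame only when its bounding box covers nearly the whole page. The function name `is_background_frame` and the 0.95 threshold are assumptions chosen for illustration, and the numbers in the example are made up.

```python
def is_background_frame(bbox_width, bbox_height, page_width, page_height,
                        threshold=0.95):
    """Return True when the bounding box covers at least `threshold` of the page area."""
    page_area = page_width * page_height
    if page_area <= 0:
        return False  # no usable page size, so do not delete anything
    return (bbox_width * bbox_height) / page_area >= threshold


# A full-page patch versus an axes-sized rectangle on the same page.
print(is_background_frame(460.8, 345.6, 460.8, 345.6))
print(is_background_frame(357.1, 266.1, 460.8, 345.6))
```

With a check like this, the script would delete the largest element only when the function returns True, and otherwise just fit the canvas to the existing objects.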

  13. #13
    pozitron57 pozitron57 @pozitron57

    OK, thanks, I see.
