Hello. I often use Inkscape to trim the whitespace from my SVG figures created with Matplotlib (python library). However, I haven't found a way to generate these figures completely without a whitespace border.
In Inkscape, I select the plot, ungroup it twice, delete two empty objects near the border, and then use 'Resize Page to Selection' to remove the remaining whitespace.
Is there a way to automate this process, possibly using command-line Inkscape commands? Alternatively, if there's another method to completely remove the whitespace from SVG files without Inkscape, I'd appreciate learning about it.
Thank you very much for the code! I modified two functions, because your code sometimes deletes elements from the figure. Here is the modified version:
def crop_and_save_processed_svg(svg_element, largest_bbox_element, svg_file, output_folder):
    # temp_folder is assumed to be defined at module level in the rest of the script
    temp_svg_filepath = os.path.join(temp_folder, svg_file)
    # Instead of deleting the largest element, we now resize the SVG canvas to match
    # the bounding box of the largest element. This ensures that only whitespace
    # around the figure is removed, without deleting any significant part of the image.
    # Compare with None explicitly: lxml elements without children evaluate as False.
    if largest_bbox_element is not None and hasattr(largest_bbox_element, 'bounding_box'):
        bbox = largest_bbox_element.bounding_box()
        svg_element.set('width', str(bbox.width))
        svg_element.set('height', str(bbox.height))
        svg_element.set('viewBox', f'{bbox.left} {bbox.top} {bbox.width} {bbox.height}')
    with open(temp_svg_filepath, 'w') as output_file:
        output_file.write(svg_element.tostring().decode('utf-8'))
    output_filepath = os.path.join(output_folder, svg_file)
    my_actions = 'select-all;fit-canvas-to-selection;'
    export_actions = my_actions + f'export-type:svg;export-filename:{output_filepath};export-do;'
    print(export_actions)
    print(f'temp svg {temp_svg_filepath}')
    inkex.command.inkscape(temp_svg_filepath, actions=export_actions)

def process_svg(svg_file, input_folder, output_folder):
    try:
        svg_element = inkex.load_svg(os.path.join(input_folder, svg_file)).getroot()
        _, largest_bbox_element = get_largest_geometric_bbox(svg_element)
    except Exception:
        print(f'cannot load svg {os.path.join(input_folder, svg_file)}')
        return
    # Passing the largest_bbox_element to the crop_and_save_processed_svg function
    # to adjust the canvas size according to this element's bounding box.
    crop_and_save_processed_svg(svg_element, largest_bbox_element, svg_file, output_folder)
I will do more testing and close the question if everything is working correctly.
I found files which can't be processed with this script.
Here is an example of a file: 335t_original.svg, and here is a modified version of your code I use (sorry for repeating):
import os

import inkex
import inkex.command

# This function processes an SVG file
def process_svg(svg_file):
    try:
        # Load the SVG file and get the root element
        svg_element = inkex.load_svg(svg_file).getroot()
        # Get the largest bounding box element
        _, largest_bbox_element = get_largest_geometric_bbox(svg_element)
    except Exception:
        # Any failure while loading or while searching the bounding boxes ends up here
        print(f'Cannot load svg {svg_file}')
        return
    # Crop the SVG and save the processed file
    crop_and_save_processed_svg(svg_element, largest_bbox_element, svg_file)

# This function crops the SVG and saves it
def crop_and_save_processed_svg(svg_element, largest_bbox_element, svg_file):
    # Change the SVG canvas size according to the bounding box
    # (compare with None explicitly: lxml elements without children evaluate as False)
    if largest_bbox_element is not None and hasattr(largest_bbox_element, 'bounding_box'):
        bbox = largest_bbox_element.bounding_box()
        svg_element.set('width', str(bbox.width))
        svg_element.set('height', str(bbox.height))
        svg_element.set('viewBox', f'{bbox.left} {bbox.top} {bbox.width} {bbox.height}')
    # Write the changes to the same SVG file
    with open(svg_file, 'w') as output_file:
        output_file.write(svg_element.tostring().decode('utf-8'))
    # Define the actions for Inkscape command line
    my_actions = 'select-all;fit-canvas-to-selection;'
    export_actions = my_actions + f'export-type:svg;export-filename:{svg_file};export-do;'
    # Execute the Inkscape actions
    inkex.command.inkscape(svg_file, actions=export_actions)

# This function returns the largest geometric bounding box
def get_largest_geometric_bbox(svg_element):
    element_area = 0
    largest_bbox_element_id = None
    largest_bbox_element = None
    # Create a list of potential elements to check
    element_list = svg_element.xpath('//svg:path | //svg:polygon | //svg:polyline | //svg:rect | //svg:use | //svg:image')
    # Determine the largest bounding box
    for element in element_list:
        if hasattr(element, 'bounding_box'):
            bbox_area = float(element.bounding_box().width) * float(element.bounding_box().height)
            if bbox_area > element_area:
                element_area = bbox_area
                largest_bbox_element_id = element.get_id()
                largest_bbox_element = element
        else:
            continue
    return largest_bbox_element_id, largest_bbox_element

# Example usage
svg_file_name = 'file.svg'
if not os.path.exists(svg_file_name):
    print(f'File Not Found: {svg_file_name}')
else:
    process_svg(svg_file_name)
I get the message "Cannot load svg <filename>". It seems that `element.bounding_box()` returns None. Could you please have a look at it and tell me what is wrong?
Here are examples of the original and trimmed figures: Trimmed SVG and Original SVG.
Easy way:
It looks like there is a group called `patch_1` which contains a path. If the naming is always the same, this should work:
inkscape --actions="select-by-id:patch_1;delete;select-all:all;fit-canvas-to-selection;export-filename:output.svg;export-do;" 67t_original.svg
(On Windows you may need to use inkscapecom.com in 1.3.)
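For batch use, the same command line can be built and run from Python. This is only a sketch: the helper names `build_trim_actions` and `trim_svg` are made up for illustration, the action string is copied from the command above, and it assumes `select-by-id` accepts a comma-separated list of ids and that an `inkscape` executable is on the PATH.

```python
import shutil
import subprocess


def build_trim_actions(ids_to_delete, output_path):
    """Build the --actions string used above for one or more element ids."""
    # Assumption: select-by-id takes a comma-separated list of ids
    id_list = ",".join(ids_to_delete)
    return (
        f"select-by-id:{id_list};delete;"
        "select-all:all;fit-canvas-to-selection;"
        f"export-filename:{output_path};export-do;"
    )


def trim_svg(input_path, output_path, ids_to_delete=("patch_1",), inkscape="inkscape"):
    """Run Inkscape once on input_path and write the trimmed file to output_path."""
    if shutil.which(inkscape) is None:
        raise FileNotFoundError(f"{inkscape} not found on PATH")
    actions = build_trim_actions(ids_to_delete, output_path)
    # check=True raises CalledProcessError if Inkscape exits with an error
    subprocess.run([inkscape, f"--actions={actions}", input_path], check=True)
```

A call such as `trim_svg("67t_original.svg", "output.svg")` would then correspond to the command line above.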
Or, the universal way: find the object with the largest bounding box and use its id.
That would require 2 command lines: use a script to process the output of `--query-all` to find the largest basic object (not group) bounding box, then do the first example with that id.
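The parsing step could look roughly like the sketch below. It assumes `--query-all` prints one `id,x,y,width,height` line per object, and it decides what counts as a "basic object" by reading the tag names from the SVG file itself. The function names and the tag list are illustrative choices, not part of Inkscape. One caveat: only elements that already carry an `id` attribute in the file are considered, so ids that Inkscape generates on load are ignored here.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"
# Element types treated as "basic objects" (an assumption; groups are left out)
BASIC_TAGS = {"path", "polygon", "polyline", "rect", "use", "image"}


def parse_query_all(query_output):
    """Map each id to (x, y, width, height) from lines like 'id,x,y,w,h'."""
    boxes = {}
    for line in query_output.splitlines():
        parts = line.strip().split(",")
        if len(parts) != 5:
            continue  # skip blank or unexpected lines
        try:
            boxes[parts[0]] = tuple(float(value) for value in parts[1:])
        except ValueError:
            continue  # skip lines whose numbers cannot be parsed
    return boxes


def basic_object_ids(svg_text):
    """Collect the ids of non-group elements that have an id in the file."""
    ids = set()
    for element in ET.fromstring(svg_text).iter():
        tag = element.tag
        if tag.startswith(SVG_NS) and tag[len(SVG_NS):] in BASIC_TAGS:
            element_id = element.get("id")
            if element_id:
                ids.add(element_id)
    return ids


def largest_object_id(query_output, candidate_ids=None):
    """Return the id with the largest bounding-box area, or None if there is none."""
    boxes = parse_query_all(query_output)
    candidates = [i for i in boxes if candidate_ids is None or i in candidate_ids]
    if not candidates:
        return None
    return max(candidates, key=lambda i: boxes[i][2] * boxes[i][3])
```

The returned id can then be placed after `select-by-id:` in the first example.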
Thank you, it worked!
For most of my plots the group does indeed have the same name.
I found out that I need to remove a group called `patch_2` as well. I'll leave my Python code here in case someone googles this, since I mentioned Matplotlib:
Could you please give more details on the second, more universal way, where there is no need to know the group id?
If that is a Python script, does it have access to `import inkex` or not? It's a lot simpler if it does.
-----------
Also, if you regularly process graphs like that, you might want to look at:
https://inkscape.org/~burghoff/%E2%98%85scientific-inkscape
Yes, sure, it is my script and I can import inkex.
Thanks for the link, interesting. But I only need to remove the whitespace from the svg figure.
In this case, if you know how to use Python, it's probably simpler to just loop through a restricted set of elements, check the bounding box size of each, and find the largest.
Delete that element, save the result as a temp file, then do the command call.
Assuming that you have
source YOUR_INKEX_VENV/bin/activate
Give this a try, it should loop through all svg files in the folder.
Usage is
Thank you very much for the code! I modified two functions, because your code sometimes deletes elements from the figure. Here is the modified version:
I will do more testing and close the question if everything is working correctly.
Yes, I did write the code to delete the element with the largest bounding box.
I made the assumption that in a lot of graphs there is a non-graph-related element acting as a background, which is often the case.
I didn't want to put too much info in the reply post.
Another way to get the bounding boxes (the visual ones; this is also the way to get text bounding boxes) is, as mentioned, parsing the output of `--query-all`.
This does, however, require 2 command calls instead of one, although it might end up being just as fast in this case.
There is a function
def inkscape_command_call_bboxes_to_dict(self, input_file)
In this extension file I wrote: https://gitlab.com/inklinea/simple-frame/-/blob/main/inklinea.py?ref_type=heads
There is also a hideous example of parsing this information in bash here:
https://inkscape.org/forums/beyond/export-svg-without-objetct-outside-the-page-using-cli/
Thank you for the replies!
I found files which can't be processed with this script.
Here is an example of a file: 335t_original.svg, and here is a modified version of your code I use (sorry for repeating):
I get the message "Cannot load svg <filename>". It seems that `element.bounding_box()` returns None. Could you please have a look at it and tell me what is wrong?
It's because there is an incomplete path element in the svg file:
<path id="STIXTwoText-20" transform="scale(0.015625)"/>
I would add a try except statement to solve this:
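The code from this reply is not preserved here. As a rough illustration only, a guard of that kind might look like the sketch below. It is written against any object that exposes a `bounding_box()` method (as inkex elements do), and the function name `largest_bbox_element` is made up. It assumes that an element such as the empty path above either returns None from `bounding_box()` or raises.

```python
def largest_bbox_element(elements):
    """Return (area, element) for the element with the largest bounding box.

    Elements without a usable bounding box are skipped instead of aborting
    the whole search.
    """
    best_area = 0.0
    best_element = None
    for element in elements:
        try:
            bbox = element.bounding_box()
            # bbox may be None (e.g. a <path> with no "d" attribute);
            # accessing .width then raises AttributeError, which is caught below
            area = float(bbox.width) * float(bbox.height)
        except (AttributeError, TypeError, ValueError):
            continue
        if area > best_area:
            best_area = area
            best_element = element
    return best_area, best_element
```

With this in place, a single incomplete element no longer triggers the misleading "Cannot load svg" message from the outer `except`.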
Thanks, the error is gone; however, the whitespace is not trimmed.
I ran it through the original code I wrote, it did trim.
I think you might need to write some extra code to decide when to remove the largest object, and when not to.
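One possible rule, offered purely as an assumption and not something tested in this thread: treat the largest object as a background, and remove it, only when its bounding box covers almost the whole page. The function name, the `(x, y, width, height)` tuple layout, and the 0.98 threshold are all hypothetical, and the sketch assumes the bounding box and the page size are in the same units.

```python
def looks_like_background(bbox, page_width, page_height, min_coverage=0.98):
    """Guess whether a bounding box belongs to a full-page background.

    bbox is (x, y, width, height). Position is ignored; only the fraction
    of the page area covered by the box is compared with min_coverage.
    """
    _, _, width, height = bbox
    if page_width <= 0 or page_height <= 0:
        return False
    # Clamp to the page so an oversized box cannot exceed 100% coverage
    covered = min(width, page_width) * min(height, page_height)
    return covered / (page_width * page_height) >= min_coverage
```

If this returns False for the largest object, the script could skip the deletion and only fit the canvas to the selection.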
OK, thanks, I see.