zim-wiki team mailing list archive

Re: Save simple text emails and remove the end of line (manual line wrapping)

 

On Thu, Apr 7, 2011 at 4:50 AM, nomnex <nomnex@xxxxxxxxx> wrote:

> I need to save simple text emails in Zim and remove the manual line
> breaks (each message line ends at about 70 chars) without removing the
> paragraph breaks.
>
> An online tool that does it:
> http://www.textfixer.com/tools/remove-line-breaks.php
>
> To achieve the same result in Zim after I copy and paste the message
> text, I probably need to select the relevant paragraphs and use the
> "replace" function with a regex.
>
> Can you help? Thanks.
>

Sounds like a great example to show how the "custom tools" can be used.

A script is attached. Put it somewhere, make it executable, then go to
"Tools -> Custom Tools" in zim and add the script with "%f" as the
commandline argument.

This adds the script to the tools menu, so you can process pages in one
go.

-- Jaap
#!/usr/bin/env python3

import sys
import re


def split_headers(text):
	'''Split zim headers from text and return both separately'''
	if text.startswith('Content-Type:'):
		# mail style headers, split on first empty line
		headers, text = text.split('\n\n', 1)
		return headers, text
	else:
		# no zim headers
		return '', text


def join_headers(headers, text):
	'''Join zim headers with body text, returns single page source'''
	if headers:
		return headers.rstrip() + '\n\n' + text.lstrip()
	else:
		return text.lstrip()


def remove_line_breaks(text):
	'''Remove line breaks within paragraphs, but keep empty lines'''
	pattern = re.compile(r'^[ \t]+\n', re.M) # pattern for whitespace-only lines
	text = pattern.sub('\n', text) # fix empty lines to be really empty

	pattern = re.compile(r'(?<!==)\n') # pattern for newline not at end of heading in zim wiki syntax
	parts = text.split('\n\n') # split on empty lines
	parts = [p.strip('\n') for p in parts] # drop stray newlines left by runs of empty lines
	parts = [pattern.sub(' ', p) for p in parts] # replace line breaks with space
	parts = [p for p in parts if len(p) and not p.isspace()] # remove empty paragraphs
	return '\n\n'.join(parts) + '\n' # join with empty lines, end with a single newline


def remove_line_breaks_in_zim_page(file):
	'''Remove line breaks in a zim page file'''
	with open(file) as fh:
		text = fh.read()

	headers, text = split_headers(text)
	text = remove_line_breaks(text)
	text = join_headers(headers, text)

	with open(file, 'w') as fh:
		fh.write(text)


if __name__ == '__main__':
	file = sys.argv[1] # first commandline argument
	remove_line_breaks_in_zim_page(file)
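
To try the paragraph-joining idea without touching a real page, here is a small standalone sketch. It is not the attached script itself: the function name unwrap_paragraphs and the sample message are made up for illustration, and it assumes paragraphs are separated by one or more empty (or whitespace-only) lines.

```python
import re


def unwrap_paragraphs(text):
    """Join hard-wrapped lines within each paragraph, keep paragraph breaks.

    Hypothetical helper for illustration, not part of zim.
    """
    # Treat whitespace-only lines as truly empty so they act as paragraph breaks.
    text = re.sub(r'^[ \t]+$', '', text, flags=re.M)
    # Split on runs of empty lines; strip outer newlines first.
    paragraphs = re.split(r'\n{2,}', text.strip('\n'))
    # Replace the remaining newlines with a space, except directly after
    # "==", which is how a heading line ends in zim wiki syntax.
    joined = [re.sub(r'(?<!==)\n', ' ', p) for p in paragraphs if p.strip()]
    return '\n\n'.join(joined) + '\n'


# Made-up sample: two wrapped paragraphs separated by a whitespace-only line.
sample = (
    "Hello Jaap,\n"
    "\n"
    "this line was wrapped by the\n"
    "mail client at about 70 chars.\n"
    "   \n"
    "Second paragraph,\n"
    "also wrapped.\n"
)
result = unwrap_paragraphs(sample)
print(result)
```

Each paragraph should come out as a single long line, with one empty line between paragraphs, so zim can re-wrap the text to the window width.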

