Replace Parts of Strings with HTML Tags



  • For my project I want to insert plain text from a QTextEdit into a PDF. At some point I thought it would be nice to make parts of the text bold, italic, or a list.
    The plain text is stored as a QString within the application and in an XML file. First I thought of writing the plain text with HTML Tags, but the HTML tags would destroy my XML file.

    Finally I had the idea to use the same formatting syntax as this forum's text editor, with asterisks for bold text and lists and underscores for italic text. I want to replace these characters with HTML tags so that I can use QTextCursor::insertHtml(const QString&) and get my bold, italic and list structures in the PDF.

    The only problem is how to replace these characters with tags. The best way to find text between asterisks etc. is to use regular expressions. I'd like your opinion on my plan:

    replace characters like "<", ">" and "&" with the HTML entities "&lt;", "&gt;" and "&amp;"

    find "newline, *, space, some text and double newline" and replace them with "<ul>some text</ul>"

    find "newline, *, space, some text and single newline" and replace them with "<li>some text</li>"

    replace "* + some text + *" with "<b>some text</b>"

    replace "_ + some text + _" with "<i>some text</i>"
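
    The steps above can be sketched roughly as follows. This is only a minimal sketch in plain C++ with std::regex, not tested against the forum's real syntax. The function names (escapeHtml, inlineMarkup, listsToHtml, toHtml) are made up for illustration. Two things differ from the plan as written: list items are detected line by line instead of with a "double newline" regex, and bold/italic are replaced before the list step so that a match can never span two list items.

```cpp
#include <regex>
#include <sstream>
#include <string>

// Step 1: escape the characters that have a meaning in HTML.
std::string escapeHtml(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;";  break;
            case '>': out += "&gt;";  break;
            default:  out += c;       break;
        }
    }
    return out;
}

// Steps 4 and 5: *text* -> <b>text</b>, _text_ -> <i>text</i>.
// A match may not cross a newline, and the character right after the
// opening marker must not be whitespace, so "* item" lines are left alone.
std::string inlineMarkup(const std::string& in) {
    static const std::regex bold(R"(\*([^*\s][^*\n]*)\*)");
    static const std::regex italic(R"(_([^_\s][^_\n]*)_)");
    const std::string out = std::regex_replace(in, bold, "<b>$1</b>");
    return std::regex_replace(out, italic, "<i>$1</i>");
}

// Steps 2 and 3: consecutive lines starting with "* " become one <ul>
// with one <li> per line. Other lines are passed through unchanged.
std::string listsToHtml(const std::string& in) {
    std::istringstream lines(in);
    std::string line, out;
    bool inList = false;
    while (std::getline(lines, line)) {
        const bool item = line.compare(0, 2, "* ") == 0;
        if (item && !inList) out += "<ul>";
        if (!item && inList) out += "</ul>";
        inList = item;
        if (item) out += "<li>" + line.substr(2) + "</li>";
        else      out += line + "\n";
    }
    if (inList) out += "</ul>";
    return out;
}

// Escape first, so the tags inserted afterwards are not escaped again.
std::string toHtml(const std::string& plain) {
    return listsToHtml(inlineMarkup(escapeHtml(plain)));
}
```

    In a Qt application the result could be handed to QTextCursor::insertHtml() via QString::fromStdString(), and QString::toHtmlEscaped() could take the place of escapeHtml(). Ordinary line breaks stay plain newlines here; turning them into paragraphs or line-break tags is left out of the sketch.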

    What do you think about this?

    Regards, Enforcer


  • Moderators

    [quote author="enforcer" date="1377849672"]First I thought of writing the plain text with HTML Tags, but the HTML tags would destroy my XML file.[/quote]
    You can use "CDATA":http://www.w3schools.com/xml/xml_cdata.asp in that case to keep the XML valid, if that's an option for you.
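
    As a rough illustration of the CDATA idea, here is a minimal sketch in plain C++. The helper name toCdata is hypothetical. The only subtlety is that the sequence "]]>" ends a CDATA section, so the sketch splits it across two sections.

```cpp
#include <string>

// Wrap arbitrary text (for example an HTML fragment) in a CDATA section.
// "]]>" would close the section early, so each occurrence is split:
// "]]" ends the current section and ">" starts the next one.
std::string toCdata(const std::string& text) {
    static const std::string end = "]]>";
    std::string out = "<![CDATA[";
    std::string::size_type pos = 0;
    std::string::size_type hit;
    while ((hit = text.find(end, pos)) != std::string::npos) {
        out.append(text, pos, hit - pos);
        out += "]]]]><![CDATA[>";
        pos = hit + end.size();
    }
    out.append(text, pos, std::string::npos);
    return out + "]]>";
}
```

    If the XML file is written with QXmlStreamWriter, its writeCDATA() function should make a hand-written helper like this unnecessary.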



  • I'll use that as a fallback solution, but I want something simple for users who are not familiar with HTML. This forum's text editor uses a pretty simple syntax. I'd like to try this one first.



  • This forum's text editor uses several regular expressions for replacements. The code can be found when you use your browser (Google Chrome with F12) and check the network's GETs and POSTs. Among the GETs there is a JavaScript file where the regex is defined.

    The problem is that it is a ciphered string, and only the algorithm defined there can decipher it. It might take some time to get the right regexp.



  • What happens when you enter this?
    @* bla _ bla * bla _@

    As you see, regular expressions are not sufficient to parse this. You'll also need a method to validate the markup. Just write a small parser and everything will be good.
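
    One possible shape for such a small parser is sketched below in plain C++. It is only an illustration under a few assumptions: the input has already been HTML-escaped, only "*" and "_" are handled, and the names Token and renderInline are made up. Markers are paired with a stack, and a marker only turns into a tag when it closes the most recently opened marker. Crossing or unmatched markers stay in the output as literal characters, so the generated HTML is always properly nested.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Token {
    std::string text;     // literal characters (for a marker: "*" or "_")
    bool marker = false;  // true for "*" and "_"
    std::string html;     // tag to emit once the marker has been paired
};

std::string renderInline(const std::string& in) {
    std::vector<Token> tokens;
    std::vector<std::size_t> open;  // stack of unpaired opening markers

    for (char c : in) {
        if (c != '*' && c != '_') {
            // Ordinary character: extend the current text token.
            if (tokens.empty() || tokens.back().marker) tokens.push_back(Token{});
            tokens.back().text += c;
            continue;
        }
        Token t;
        t.text = std::string(1, c);
        t.marker = true;
        tokens.push_back(t);
        const std::size_t self = tokens.size() - 1;

        bool alreadyOpen = false;
        for (std::size_t i : open) {
            if (tokens[i].text[0] == c) alreadyOpen = true;
        }

        if (!open.empty() && tokens[open.back()].text[0] == c) {
            // Closes the innermost open marker: a properly nested pair.
            const std::string name = (c == '*') ? "b" : "i";
            tokens[open.back()].html = "<" + name + ">";
            tokens[self].html = "</" + name + ">";
            open.pop_back();
        } else if (!alreadyOpen) {
            open.push_back(self);  // a new opening marker
        }
        // Otherwise the marker would cross another pair: keep it literal.
    }

    // Markers that were never paired have an empty html field and are
    // written out as the literal character.
    std::string out;
    for (const Token& t : tokens) out += t.html.empty() ? t.text : t.html;
    return out;
}
```

    With this sketch the example input above yields a single italic span, and both asterisks are kept as plain text.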

