Circumventing QTextDocument restriction to single-line regex search



  • QTextDocument's find(regex) method will not attempt to match across a "block" boundary (see the answers in "this post":http://developer.qt.nokia.com/forums/viewthread/6665). As a result, although the QRegExp documentation says that \n is matched by .* and by \s, a regex search can never match \n in a QPlainTextEdit document: each \n marks the end of a block, and the search never spans blocks. It therefore seems impossible for a pattern like
    @
    <b>.*</b>
    @
    to find a match when the markup begins on one line and ends on another.

    There is a way around this. The restriction is in QTextDocument; QRegExp does perform as documented when it is applied to a QString. So the following code can apply a general regex to a document. It uses PyQt4; the translation to C++ should be clear. Assume a QPlainTextEdit, qpte, whose cursor is the starting point for the search, and a QRegExp, qrxp, that has been prepared with a search pattern such as "<b>.*</b>" and with its minimal-matching and case-sensitivity options set.

    @
    from PyQt4.QtGui import QTextCursor

    start_tc = qpte.textCursor() # cursor with starting position
    range_tc = QTextCursor(start_tc) # make a copy linked to same doc
    range_tc.movePosition(QTextCursor.End) # point to end of doc
    # set cursor to select all text from starting point to end
    range_tc.setPosition(start_tc.selectionStart(), QTextCursor.KeepAnchor)
    # apply regexp to (part of) the document as a QString
    hit_pos = qrxp.indexIn(range_tc.selectedText())
    if hit_pos > -1: # first occurrence at hit_pos offset
        find_tc = QTextCursor(start_tc) # another cursor
        find_tc.setPosition(start_tc.selectionStart() + hit_pos) # point to hit
        find_tc.movePosition(QTextCursor.Right,
                             QTextCursor.KeepAnchor,
                             qrxp.matchedLength()) # select matched text
        qpte.setTextCursor(find_tc) # match visible to user
    @
    One restriction remains: QTextCursor.selectedText() returns a string in which each \n character has been replaced with the Unicode paragraph separator (U+2029). That character is still matched by \s and by .*, but to match a literal \n you have to change the pattern given to the regexp, replacing every \n with \x2029.
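
The pattern rewrite described above can be automated. What follows is only a minimal sketch in plain Python, not code from the original post: the helper name newline_to_paragraph_separator is hypothetical, and it assumes the pattern is available as an ordinary Python string before it is handed to QRegExp. It replaces both the two-character escape \n and a real line-feed character with \x2029, and it leaves an escaped backslash followed by n (\\n) untouched.

```python
import re

# Matches either a backslash escape (backslash plus one character)
# or a real line-feed character inside the pattern text.
_ESCAPE_OR_NEWLINE = re.compile(r"\\(.)|\n", re.DOTALL)


def newline_to_paragraph_separator(pattern):
    """Return pattern with every newline reference rewritten as \\x2029.

    Hypothetical helper: QTextCursor.selectedText() yields U+2029 where
    the document has block boundaries, so a pattern that asks for a
    literal newline has to ask for U+2029 instead.
    """
    def swap(match):
        token = match.group(0)
        # The escape sequence backslash-n, or an actual line feed.
        if token == "\\n" or token == "\n":
            return "\\x2029"
        # Any other escape (including an escaped backslash) is kept as-is.
        return token

    return _ESCAPE_OR_NEWLINE.sub(swap, pattern)
```

The result would then be passed to the QRegExp constructor in place of the original pattern, for example QRegExp(newline_to_paragraph_separator(pattern)); that call is left out so the sketch runs without PyQt4.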



  • This reads like a nice howto. Would you consider putting the above in a page on the wiki?

