Editing .pptx in Qt
-
I have the following problem: I am trying to edit slide*.xml inside a .pptx. I unpack the .pptx, then in a loop I readLine() from the .xml file, modify each line (I only modify the visible text), and write it to a QTemporaryFile. After that I replace the source .xml with the temp file (remove the source file and QTemporaryFile::copy()). But after zipping, PowerPoint tells me that it cannot open the file and says something about recovering it. I compared the source file and the modified file; the only change is the modified text. I also tried the whole process manually, but it only works if I modify the source .xml file in place. Replacing it with another file that contains the same information gives the error. I also tried setPermissions() on the copied temp file, using the permissions the source file had, but it did not help.
How do I modify a .pptx correctly? Maybe PowerPoint's .xml files have some kind of encryption, like additional bytes written at the end? -
No particular "kind of encryption", just plain old XML.
then in loop I readline() from .xml file
I would use the QtXml module here to make the changes, see http://www.qtcentre.org/threads/14392-how-to-modify-the-text-of-node-in-xml-docment
To identify the problem better I'd ask for two things:
- Using your method, if you modify only one XML file and then tell PowerPoint to recover the file, are all the slides messed up or just the one you modified?
- If you edit the XML manually, does it work?
Also,
after zipping
What do you use to zip it?
-
I'm not familiar with the pptx file format, but is it possible that PowerPoint stores file checksums somewhere in the archive and checks them when loading?
-
@VRonin I have tried both WinRAR and 7z. The problem is only with the slides I modified. If I modify the existing .xml, everything is OK, but replacing it with a file that contains the same text gives an error.
I want to fix the values of each field where the Cyrillic text was replaced with question marks due to some kind of bug in the program (not mine). -
I have found an interesting thing: every XML file that is created by PowerPoint has some invisible symbols at the beginning that cannot be deleted in an editor (like Notepad).
slide1.xml is the file generated by the program, and slide3.xml is an original file; I copied the text from slide1.xml and replaced the text in slide3.xml with it. Here is the result:
C:\Users\Denis\Documents\apertium-gp-docs\p.GN2aKw\ppt\slides>fc /A slide1.xml slide3.xml
Comparing files slide1.xml and SLIDE3.XML
***** slide1.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<p:sld xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocumen
***** SLIDE3.XML
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<p:sld xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocumen
-
That looks like a BOM (byte order mark) for UTF-8.
http://stackoverflow.com/questions/4614378/getting-ï-at-the-beginning-of-my-xml-file-after-save -
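A UTF-8 BOM is the three-byte sequence EF BB BF at the very start of a file, which is why it does not show up in a text comparison. As a rough sketch, this is one way to check for it and drop it from the raw bytes. It uses plain standard C++ rather than Qt, and the names kUtf8Bom, hasUtf8Bom and stripUtf8Bom are made up for illustration:

```cpp
#include <string>

// The three bytes that make up a UTF-8 byte order mark (BOM).
const std::string kUtf8Bom = "\xEF\xBB\xBF";

// Returns true if the raw file contents start with a UTF-8 BOM.
bool hasUtf8Bom(const std::string& bytes) {
    return bytes.compare(0, kUtf8Bom.size(), kUtf8Bom) == 0;
}

// Returns a copy of the contents with a leading UTF-8 BOM removed, if any.
std::string stripUtf8Bom(const std::string& bytes) {
    return hasUtf8Bom(bytes) ? bytes.substr(kUtf8Bom.size()) : bytes;
}
```

To try it on a slide, read the file in binary mode (for example with std::ifstream and std::ios::binary) so the leading bytes are not altered, and pass the contents to hasUtf8Bom().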
@mrjj Thanks, your answer pointed my search in the right direction. There was no BOM in the original file; it appeared there through my own fault :)
So, I have finally solved it. There were two problems.
The first one was the encoding of the Cyrillic symbols when I was writing to the temporary file. It can be solved by setting the encoding manually.
// Write the slide text as UTF-8 so the Cyrillic characters are encoded correctly.
QTextStream stream(&tmpSlide);
stream.setCodec(QTextCodec::codecForName("UTF-8"));
The second one: I forgot to set the archive type for 7z.
It can be solved by adding -tzip to the args.
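The archive type matters because a .pptx is a ZIP container, so PowerPoint cannot open it if the archiver wrote some other format. A quick way to check what was actually produced is to look at the first bytes of the repacked file. Below is a minimal sketch in plain standard C++ (not Qt); the function name archiveKind is hypothetical, and it only recognises the two signatures relevant here:

```cpp
#include <string>

// Guesses the container format from the first bytes of a file.
// Returns "zip", "7z", or "unknown".
std::string archiveKind(const std::string& header) {
    // ZIP local file header signature: 50 4B 03 04 ("PK\x03\x04").
    static const std::string zipMagic("PK\x03\x04", 4);
    // 7z archive signature: 37 7A BC AF 27 1C.
    static const std::string sevenZipMagic("7z\xBC\xAF\x27\x1C", 6);

    if (header.compare(0, zipMagic.size(), zipMagic) == 0) {
        return "zip";
    }
    if (header.compare(0, sevenZipMagic.size(), sevenZipMagic) == 0) {
        return "7z";
    }
    return "unknown";
}
```

Passing the first few bytes of the repacked .pptx (read in binary mode) should give "zip"; if it gives "7z", the archive was not written in the format PowerPoint expects.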