[Solved] Duplicate finder
-
wrote on 25 Jan 2011, 13:01 last edited by
I would suggest using a QStringList instead of QString path[xx];
-
wrote on 25 Jan 2011, 13:04 last edited by
First: show us your code and we will comment on it. Don't ask dumb questions that are clearly answered in the very good API docs the Trolls have created for us. It is very likely that you will not get any answer (apart from "RTFM"). We all put a considerable amount of valuable time into DevNet to answer questions, and with that silly game you are stealing this time!
Second: Do not use C-Style arrays in C++ if you are not absolutely forced to. Use the fine "Container Classes":http://doc.qt.nokia.com/stable/containers.html of Qt (or the equivalents of C++ standard library or boost). In your case "QStringList":http://doc.qt.nokia.com/stable/qstringlist.html is what you want.
Third: C-style arrays whose size is unknown at compile time (variable-length arrays) are not part of standard C++. Some compilers accept them only as an extension, so they are not portable. I leave it to you to google or bing for the details.
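A minimal sketch of the container approach, using the C++ standard library equivalent (std::vector of std::string) rather than QStringList so that it builds without Qt. The function name collectPaths and the generated file names are hypothetical, only there to show a list whose size is decided at run time:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Builds a list of `count` paths; the size is known only at run time,
// which a plain C-style array cannot express portably.
std::vector<std::string> collectPaths(std::size_t count)
{
    std::vector<std::string> paths;
    paths.reserve(count);  // optional: avoids repeated reallocation
    for (std::size_t i = 0; i < count; ++i) {
        // Hypothetical file names, purely for illustration.
        paths.push_back("file" + std::to_string(i) + ".txt");
    }
    return paths;
}
```

With Qt the same shape would be a QStringList filled with append() or operator<<.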
-
wrote on 25 Jan 2011, 20:56 last edited by
Don't you think computing a hash is a bit of overkill for detecting duplicate files?
Just read your files' contents into memory blocks and compare them with memcmp. If you may have big files, it would be wiser to compare them block by block rather than the whole files at once.
Upd.: function name
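A rough sketch of the block-by-block comparison described above, in standard C++ (std::ifstream plus std::memcmp) rather than QFile. The function name sameContents and the 64 KiB block size are assumptions made for the example, not something from this thread:

```cpp
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>
#include <vector>

// Returns true if both files can be opened and hold identical bytes.
// Reads fixed-size blocks so large files never sit in memory as a whole.
bool sameContents(const std::string& pathA, const std::string& pathB)
{
    const std::size_t kBlockSize = 64 * 1024;  // assumed block size

    std::ifstream a(pathA, std::ios::binary);
    std::ifstream b(pathB, std::ios::binary);
    if (!a || !b) {
        return false;  // a file that cannot be opened is treated as "not equal"
    }

    std::vector<char> bufA(kBlockSize);
    std::vector<char> bufB(kBlockSize);
    for (;;) {
        a.read(bufA.data(), static_cast<std::streamsize>(bufA.size()));
        b.read(bufB.data(), static_cast<std::streamsize>(bufB.size()));
        const std::streamsize gotA = a.gcount();
        const std::streamsize gotB = b.gcount();

        if (gotA != gotB) {
            return false;  // different lengths
        }
        if (gotA == 0) {
            return true;   // both files ended without a mismatch
        }
        if (std::memcmp(bufA.data(), bufB.data(),
                        static_cast<std::size_t>(gotA)) != 0) {
            return false;  // first differing block ends the comparison
        }
    }
}
```

This is fine for checking one pair of files. For many files it still means comparing every file with every other one.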
-
wrote on 25 Jan 2011, 21:22 last edited by
memcpy copies memory; it does not compare it.
AFAIK, he wanted to search for duplicates, so hashes would be faster. You don't want to do a full compare of every file against every other file...
-
wrote on 25 Jan 2011, 21:27 last edited by
Gerolf, thank you for the function name correction.
And yes, you are right about the question. I misinterpreted the OP's goal :)
-
wrote on 28 Jan 2011, 10:19 last edited by
What do you think, is this correct?
@
while(it.hasNext())
{
    it.next();
    if(it.peekPrevious().key()==it.peekNext().key())
        std::cout<<it.peekPrevious().value()<<"="
                 <<it.peekNext().value()<<std::endl;
}
@
-
wrote on 28 Jan 2011, 11:04 last edited by
No it is not correct.
It compares only adjacent entries in your container.
If your container is a map (QMap or QHash) and you use insert() to populate it then you will get no duplicates at all, since every key occurs only once, hence the keys at different positions are all distinct.
You must use insertMulti() and values() to get a list of all entries with the same hash value. Or use QMultiMap/QMultiHash with the aforementioned methods.
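The idea can be sketched with the standard library, where std::unordered_multimap stands in for QMultiHash and equal_range() plays the role of values(key). The names DigestIndex and duplicateGroups, and the use of file names as values, are assumptions for illustration only:

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// digest (for example an MD5 string) -> file name; the same digest may occur many times
using DigestIndex = std::unordered_multimap<std::string, std::string>;

// Returns only those digests that are shared by two or more files,
// each mapped to the sorted list of file names.
std::map<std::string, std::vector<std::string>>
duplicateGroups(const DigestIndex& filesByDigest)
{
    std::map<std::string, std::vector<std::string>> groups;
    for (auto it = filesByDigest.begin(); it != filesByDigest.end(); ) {
        // All entries with this key, wherever they were inserted.
        const auto range = filesByDigest.equal_range(it->first);

        std::vector<std::string> names;
        for (auto entry = range.first; entry != range.second; ++entry) {
            names.push_back(entry->second);
        }
        if (names.size() > 1) {
            std::sort(names.begin(), names.end());
            groups[it->first] = names;
        }
        it = range.second;  // jump past the whole group of equal keys
    }
    return groups;
}
```

Nothing here depends on entries being neighbours in insertion order: the lookup by key collects every entry with the same digest.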
-
wrote on 28 Jan 2011, 11:25 last edited by
Yes, you're right, thanks
-
wrote on 29 Jan 2011, 16:23 last edited by
i can't understand why it's not working :(
@
while(it.hasNext())
{
    it.next();
    if(it.key()==it.peekNext().key()) {
        std::cout << "i've got you" << std::endl;
    }
}
@
PS: I've used insertMulti() to add items to the hash, as you said.
-
wrote on 29 Jan 2011, 16:39 last edited by
maybe like this?
@
int compare_flag;
while(it.hasNext()) {
    it.next();
    compare_flag = QString::compare(it.key(),it.peekNext().key(),Qt::CaseSensitive);
    if(compare_flag==0) {
        std::cout << "i've got you" << std::endl;
    }
}
@
-
wrote on 29 Jan 2011, 16:45 last edited by
The keys in a (hash) map are always distinct. You will never find two identical keys so your comparison will never be true.
And even if you had identical keys in your container you would only find them if they are adjacent in the list.
But I'm starting to get a sense of déjà vu...
To make things clearer for us to understand: You do have a multi hash/multi map. What do you put in there and what do you expect to come out?
-
wrote on 29 Jan 2011, 17:01 last edited by
@QHash<QString,int> FilesHash;@
QString key is MD5
int value - just the number of the file
On output I want to see the names of similar files.
-
wrote on 29 Jan 2011, 17:15 last edited by
Ok, let's make things clearer step by step. It seems that you should familiarize yourself with the concept of a map.
A map (QHash is one) stores values associated with keys. Every key only exists once in the map - I wrote that several times, let's prove it:
@
QHash<QString, int> myHash;
myHash.insert("abc", 2);
myHash.insert("def", 3);
myHash.insert("abc", 5);
qDebug() << "hash keys:" << myHash.keys();

QHash<QString, int> myMultiHash;
myMultiHash.insertMulti("abc", 2);
myMultiHash.insertMulti("def", 3);
myMultiHash.insertMulti("abc", 5);
qDebug() << "multi hash keys:" << myMultiHash.keys();
@
What will the output be?
What will happen if you compare every key with every other?
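For anyone who wants to try the experiment without Qt, here is a rough standard-library analogue. It assumes that assignment through operator[] on std::unordered_map mirrors QHash::insert() (a repeated key replaces the old value) and that std::unordered_multimap::insert() mirrors insertMulti() (both entries are kept). The two function names are made up for the example:

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Overwriting inserts: the second "abc" replaces the first one.
std::size_t entriesAfterOverwritingInserts()
{
    std::unordered_map<std::string, int> hash;
    hash["abc"] = 2;
    hash["def"] = 3;
    hash["abc"] = 5;  // replaces the value 2, no new entry
    return hash.size();
}

// Multi inserts: both "abc" entries are kept side by side.
std::size_t entriesAfterMultiInserts()
{
    std::unordered_multimap<std::string, int> multiHash;
    multiHash.insert({"abc", 2});
    multiHash.insert({"def", 3});
    multiHash.insert({"abc", 5});  // adds a second entry for "abc"
    return multiHash.size();
}
```

The first container ends up with two entries and all keys distinct, so comparing every key with every other never matches. The second ends up with three entries, two of which share the key "abc".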
-
wrote on 29 Jan 2011, 17:34 last edited by
insertMulti allows you to store items with similar keys
-
wrote on 29 Jan 2011, 17:35 last edited by
without overwriting them
-
wrote on 31 Jan 2011, 08:58 last edited by
what do you think about it?
@
bool ok;
QHashIterator<QString,int> it(FilesHash);
QHashIterator<QString,int> begin(FilesHash);
QHashIterator<QString,int> end(FilesHash);
while(it.hasNext()) {
    it.next();
    begin = qLowerBound(FilesHash.begin(), FilesHash.end(), it.key());
    end = qUpperBound(begin, FilesHash.end(), it.key());
    iter = begin;
    while(iter!=end) {
        if(*i=*it) {
            ok = true;
        } else { ok = false; }
    }
}
@
-
wrote on 31 Jan 2011, 10:18 last edited by
why i cannot do like this?
@
QHashIterator<QString,int> iter(FilesHash);
while(it.hasNext()) {
    it.next();
    iter = qBinaryFind(FilesHash.begin(), FilesHash.end(), it.key());
}
@
-
wrote on 2 Feb 2011, 08:01 last edited by
please note that the problem has been solved :)
-
wrote on 2 Feb 2011, 08:06 last edited by
you can do it on your own:
go to your first post and click edit :-)
and edit the title.