Poor performance COP (compared to Aperture) searching using
Coming from Aperture, I noticed poor performance in COP when searching by keyword in big catalogs (40.000 images).
I found in this forum the option "Hardware acceleration (Use OpenCL for)".
This helped a lot for smaller catalogs, but my big catalog was still performing poorly.
Then I found that both applications use SQLite as their database.
Via sqlite.org I installed some utility programs and compared the databases.
(Aperture seems to use more than one database, but I only compared one.)
I am not a database expert, but I noticed that the Aperture database has 125 indexes, as opposed to only 23 in COP. As far as I know, indexes are used to speed up searches, so this could explain the difference in performance.
PhaseOne also has Media Pro SE. I have not tried this program, but I assume it also uses SQLite as its database.
From what I have read, it is "blazing fast", so I suppose the knowledge of how to build a fast database is available.
That would mean it is more a political/commercial/business-model decision by PhaseOne how far to merge both programs into one, or how much of Media Pro SE's functionality is put into COP.
Jan Wessel
Below is some data from the sqlite3_analyzer utility:
/** Disk-Space Utilization Report For /Users/imacjan/Pictures/Aperture_Librarys/Fotoos_Jan.aplibrary/Database/apdb/Library.apdb
Page size in bytes................................ 16384
Pages in the whole file (measured)................ 34799
Pages in the whole file (calculated).............. 34799
Pages that store data............................. 34788 99.968%
Pages on the freelist (per header)................ 0 0.0%
Pages on the freelist (calculated)................ 0 0.0%
Pages of auto-vacuum overhead..................... 11 0.032%
Number of tables in the database.................. 35
Number of indices................................. 125
Number of defined indices......................... 125
Number of implied indices......................... 0
Size of the file in bytes......................... 570146816
Bytes of user payload stored...................... 396017893 69.5%
/** Disk-Space Utilization Report For /Volumes/LaCieRAID1/CaptureOneProCatalogs/1972-2015.cocatalog/1972-2015.cocatalogdb
Page size in bytes................................ 1024
Pages in the whole file (measured)................ 210319
Pages in the whole file (calculated).............. 210319
Pages that store data............................. 210319 100.0%
Pages on the freelist (per header)................ 0 0.0%
Pages on the freelist (calculated)................ 0 0.0%
Pages of auto-vacuum overhead..................... 0 0.0%
Number of tables in the database.................. 19
Number of indices................................. 23
Number of defined indices......................... 20
Number of implied indices......................... 3
Size of the file in bytes......................... 215366656
Bytes of user payload stored...................... 179007094 83.1%
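The index counts in the two reports above can also be read directly from SQLite's schema table. What follows is a minimal Python sketch, not a description of how either application works: the `image` and `keyword` tables are invented for the demo and are not the real Aperture or Capture One schema. "Implied" indexes are the ones SQLite creates on its own for UNIQUE or PRIMARY KEY constraints; they have no stored SQL text. The totals may differ slightly from what sqlite3_analyzer prints, since the tool may count internal tables as well.

```python
import sqlite3

def count_schema_objects(con):
    """Return (tables, defined_indexes, implied_indexes) for an open SQLite connection."""
    tables = con.execute(
        "SELECT count(*) FROM sqlite_master WHERE type = 'table'"
    ).fetchone()[0]
    # Indexes created with CREATE INDEX keep their SQL text.
    defined = con.execute(
        "SELECT count(*) FROM sqlite_master WHERE type = 'index' AND sql IS NOT NULL"
    ).fetchone()[0]
    # Automatic indexes (from UNIQUE / PRIMARY KEY constraints) have sql = NULL.
    implied = con.execute(
        "SELECT count(*) FROM sqlite_master WHERE type = 'index' AND sql IS NULL"
    ).fetchone()[0]
    return tables, defined, implied

# Hypothetical demo schema, for illustration only.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE image (id INTEGER PRIMARY KEY, path TEXT UNIQUE)")
con.execute("CREATE TABLE keyword (image_id INTEGER, word TEXT)")
con.execute("CREATE INDEX keyword_word ON keyword (word)")

counts = count_schema_objects(con)
print(counts)
```

To try this on a real catalog, it seems safest to point `sqlite3.connect` at a copy of the database file rather than the one the application is using.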
As a professional with more than 30 years of experience with database applications, data warehouses etc., I can tell you that it is totally useless to compare the database models of two different applications without knowing anything about the two applications, their internal models, and how they work.
In theory, it can be that the CO1 database model is far better for the CO1 application than an Aperture-like database model. Of course, the inverse can also be true.
You must know, for instance, when CO1 creates, reads, updates and deletes records in its internal data model, and when it wants to store this information persistently in the database. Another aspect is caching: which parts of the data in the database are read once and then cached in memory? You don't need any index if you do a full table scan once and then keep the data in memory.
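The difference between a full table scan and an index lookup can be made visible with SQLite's `EXPLAIN QUERY PLAN`. This is only a sketch on an invented `keyword` table; it shows what SQLite does for one query and says nothing about what CO1 actually reads, caches, or queries.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical table: one row per (image, keyword) pair.
con.execute("CREATE TABLE keyword (image_id INTEGER, word TEXT)")
con.executemany(
    "INSERT INTO keyword VALUES (?, ?)",
    ((i, f"kw{i % 100}") for i in range(10_000)),
)

def plan(con, sql, params=()):
    """Return the human-readable part of EXPLAIN QUERY PLAN as one string."""
    rows = con.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the description of the step.
    return " | ".join(row[-1] for row in rows)

query = "SELECT image_id FROM keyword WHERE word = ?"

plan_before = plan(con, query, ("kw7",))   # no index: whole table is scanned
con.execute("CREATE INDEX keyword_word ON keyword (word)")
plan_after = plan(con, query, ("kw7",))    # index available: direct lookup

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan text varies between SQLite versions, but the first plan should describe a scan of `keyword` and the second a search using `keyword_word`.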
When you have all the necessary information, you can think of a well-suited database model and compare it with reality, i.e. the one used by CO1.
But what you can do is compare the performance of e.g. CO1 and Aperture on a large catalogue, e.g. 50'000 images, when searching on keywords. And if one is much faster than the other, then there is definitely room for improvement in the slower one. But the problem can lie anywhere.
And sorry if my tone is a little bit rude. It is not directed against you.
--peter
Hi Peter,
Thanks for your response.
No problem about being rude; you seem to know what you are talking about.
My experience is with IBM's DB2 and Unisys's DMS-1100 (in the mainframe world), and MySQL on PCs.
I understand you should not be comparing two systems at random.
But Aperture and COP (or C1P or CO1, not sure what the correct naming is) are both applications with DAM functionality and a keyword search function, so they are comparable in that respect.
And COP performs very poorly when searching in a catalog with 50.000 images.
Minutes in COP versus seconds in Aperture.
That's why I tried to understand how it works.
And the concept of SQLite was completely new to me.
Apparently I have missed this development.
"A complete SQL database with multiple tables, indices, triggers, and views, is contained in a single disk file."
And as I pointed out, the improvement that PhaseOne is undoubtedly capable of making might depend on other considerations.
Jan
Same problem here. I am a sports photographer; I take about 300.000 pictures a year and have very big catalogues.
I have worked with Aperture, and Aperture is speedy with big catalogues.
CO has better functionality, image quality, etc., but I am disappointed when I need to wait many minutes to access pictures.
Phase One: do something to boost the database!!!
yann.queniart wrote: "Phase One: do something to boost the database!!!"
Yes, please!! It's the only serious issue with the software.
I notice that COP seems to use a database page size of 1024.
An adequate size for 2003, but with today's hardware a better/faster choice might be 4096, according to sqlite.org.
It would then be very interesting if someone rebuilt a COP database with a page size of 4096 and compared its speed with the same database at a page size of 1024.
Not a mac user myself, but on Windows I believe the rebuild could be accomplished in less than an hour.
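For anyone who wants to try such a rebuild, SQLite changes the page size of an existing database when `PRAGMA page_size` is set and `VACUUM` is then run. A rough Python sketch follows; the file names and the `keyword` table are made up for the demo, and whether COP accepts or benefits from a rebuilt catalog is untested here, so it should only ever be run on a copy.

```python
import os
import shutil
import sqlite3
import tempfile

def page_size_of(path):
    """Read the page size currently used by an SQLite file."""
    con = sqlite3.connect(path)
    try:
        return con.execute("PRAGMA page_size").fetchone()[0]
    finally:
        con.close()

def rebuild_with_page_size(src_path, dst_path, page_size=4096):
    """Copy src_path to dst_path and rebuild the copy with a new page size."""
    shutil.copyfile(src_path, dst_path)          # never touch the original
    con = sqlite3.connect(dst_path, isolation_level=None)  # autocommit, VACUUM needs it
    try:
        # Note: SQLite cannot change the page size of a database in WAL mode.
        con.execute(f"PRAGMA page_size = {int(page_size)}")
        con.execute("VACUUM")                    # rewrites the file with the new size
    finally:
        con.close()
    return page_size_of(dst_path)

# Demo on a throwaway database created with 1024-byte pages.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "pages_1024.db")
dst = os.path.join(workdir, "pages_4096.db")

con = sqlite3.connect(src, isolation_level=None)
con.execute("PRAGMA page_size = 1024")           # must be set before the first write
con.execute("CREATE TABLE keyword (image_id INTEGER, word TEXT)")
con.executemany("INSERT INTO keyword VALUES (?, ?)",
                ((i, f"kw{i % 50}") for i in range(1000)))
con.close()

old_size = page_size_of(src)
new_size = rebuild_with_page_size(src, dst, 4096)
print(old_size, "->", new_size)
```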
And, I agree with the OP that the number of indices seems inadequate; fewer than the number of tables? But adding indices is an easy task, so why all the speculation and no experimentation?
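As a starting point for that kind of experiment, the sketch below times the same keyword lookup before and after adding an index. It runs on synthetic data in an in-memory database with an invented `keyword` table, so the numbers only illustrate the method; they are not a measurement of COP.

```python
import sqlite3
import time

def time_query(con, sql, params, repeats=20):
    """Run a query several times; return (average seconds, last result rows)."""
    start = time.perf_counter()
    for _ in range(repeats):
        rows = con.execute(sql, params).fetchall()
    return (time.perf_counter() - start) / repeats, rows

con = sqlite3.connect(":memory:")
# Hypothetical table: 200,000 (image, keyword) pairs, 5,000 distinct keywords.
con.execute("CREATE TABLE keyword (image_id INTEGER, word TEXT)")
con.executemany(
    "INSERT INTO keyword VALUES (?, ?)",
    ((i, f"kw{i % 5000}") for i in range(200_000)),
)

query = "SELECT image_id FROM keyword WHERE word = ? ORDER BY image_id"

secs_before, rows_before = time_query(con, query, ("kw42",))
con.execute("CREATE INDEX keyword_word ON keyword (word)")
secs_after, rows_after = time_query(con, query, ("kw42",))

print(f"without index: {secs_before * 1000:.3f} ms per query")
print(f"with index:    {secs_after * 1000:.3f} ms per query")
```

The result rows are kept so that one can check both runs return the same images; only the time should change.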
Mogens
This is a very interesting discussion! Has anyone offered these ideas to Phase One via a support request? I would assume they have database experts on staff, but maybe the community (not me) has some expertise there as well.
First let me explain my situation.
I have an Aperture Library containing about 45.000 images.
These images are referenced and located on an external Lacie Thunderbolt drive.
In a folder, with subfolders per year, and these have subfolders per project.
(These 45.000 images are 50% JPGs from scanned negatives and 50% from different digital cameras.)
(I found out that at some point in the past I wrote the keywords of the JPGs into the image files themselves, probably using Aperture’s Metadata tab, Write IPTC metadata to original.)
I exported batches of around 10.000 images from Aperture to new Aperture libraries.
I then imported these libraries into COP, resulting in one COP catalog with 45.000 images.
(Importing the original Aperture Library into COP resulted in COP crashing.)
All Keywords from Aperture arrived in COP. Happy with that.
But searching for images containing a specific Keyword takes minutes.
I installed the trial version of Media Pro SE.
It took about 5 hours for Media Pro SE to scan the 45.000 images.
But after that searching for images with a specific Keyword is very fast.
At this stage I found out that Media Pro SE only had the keywords of those images for which I had used Write IPTC metadata to original.
To get the keywords from COP to Media Pro SE, you need to set COP to use XMP sidecar files. Once you have done that, the metadata gets synced between the COP database and the Media Pro SE database.
And I could search in Media Pro SE for images with a specific keyword just as I could in Aperture. And fast!
But… I don’t want XMP sidecar files.
I don’t know why; I just don’t want them.
Then I installed a trial version of Photo Mechanic 5.
This program lets you manipulate all kinds of metadata, including keywords
(even hierarchical keywords).
It writes directly into the image file and does not have its own database
(except for lots of presets that you can keep in the program).
But this means I would have to change my workflow in that I add Keywords to my images before importing them into COP.
Although I like the program very much, this is not very appealing to me.
So…
Media Pro SE is a great program; it has no limits regarding size (as COP has) and you can use it for files other than images too.
Photo Mechanic 5 is a great program, but it doesn’t fit my workflow.
@mil20: I do not want to start fiddling with the SQLite database myself.
@BobRockefeller: That leaves me with only one option: asking PhaseOne through a feature request to please improve the performance of keyword searches.
Thanks to all for reading and responding so far.
Jan
I don't have any problem searching for keywords in C1. It takes one or two seconds across my 17,000 images.