bkrpr team mailing list archive

Some simple stats on book scans at Archive.org.

See forwarded message below -- looks like we've got our stats.  I'll try
to parse 'em out as soon as I can.

And please follow up to let me know that you got this message, as we
have apparently had some mailing list problems.  Since this list is
still small, please reply to the whole list, so we can make sure
sending works for you as well as receiving.

-Karl

--- Begin Message ---
Hi, Karl. If you want a mess-o bib records, the LC Books All file (to 
2006) is at:
   http://www.archive.org/details/marc_records_scriblio_net

As you probably know, you'll find the book heights in the 300 $c, in
centimeters: "23 cm."
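
For anyone parsing these, here is a minimal sketch of pulling a height out of a 300 $c value and computing what share of books exceed a given limit. It assumes the subfield has already been extracted as a plain string by whatever MARC reader you use, and that a value like "28 x 22 cm." lists height first; the function names are made up for illustration.

```python
import re

# A number, optionally followed by "x <number>" parts, then "cm".
# Assumption: when several dimensions are given, the first is the height.
_CM = re.compile(r"(\d+(?:\.\d+)?)(?:\s*x\s*\d+(?:\.\d+)?)*\s*cm\b",
                 re.IGNORECASE)


def height_cm(subfield_c):
    """Return the height in centimeters from a 300 $c string, or None."""
    match = _CM.search(subfield_c)
    return float(match.group(1)) if match else None


def share_taller_than(heights, limit_cm):
    """Fraction of known heights strictly above limit_cm."""
    known = [h for h in heights if h is not None]
    if not known:
        return 0.0
    return sum(h > limit_cm for h in known) / len(known)
```

Calling share_taller_than on the parsed heights with a limit of 25.4 would give the fraction of records taller than 10 inches. Records with no parseable height are skipped rather than counted as short.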

There is also a dump of the OL data in JSON:
   http://www.archive.org/download/oldumps/jsondump.json.gz

That includes records from sources other than LC (other libraries, 
Amazon). That will have the size in "physical_dimensions". In addition 
to the LC height in centimeters, there you will also find some that look 
like: "11 x 9.4 x 0.7 inches." Those come generally from Amazon.
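
Since "physical_dimensions" mixes centimeter and inch values, a rough sketch of normalizing both to centimeters might look like the following. It assumes the field is a free-text string in one of the two shapes quoted above, and it takes the largest number as the side that matters for scanner size, because the order of the Amazon dimensions is not stated here. The function name is hypothetical.

```python
import re

CM_PER_INCH = 2.54
_NUM = re.compile(r"\d+(?:\.\d+)?")


def longest_side_cm(physical_dimensions):
    """Return the largest dimension in centimeters, or None if unparseable."""
    text = physical_dimensions.lower()
    if "inch" in text:
        factor = CM_PER_INCH      # Amazon-style values, e.g. "11 x 9.4 x 0.7 inches"
    elif "cm" in text or "centimeter" in text:
        factor = 1.0              # LC-style values, e.g. "23 cm."
    else:
        return None               # unknown unit: do not guess
    numbers = [float(n) for n in _NUM.findall(text)]
    return max(numbers) * factor if numbers else None
```

For "11 x 9.4 x 0.7 inches" this yields roughly 27.94 cm. Strings with no recognizable unit come back as None, so they can be counted separately instead of skewing the statistics.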

Have fun!
kc

Karl Fogel wrote:
> Hank Bromley <hank {_AT_} archive.org> writes:
>   
>> We don't really compile data on the physical sizes of our books, but I
>> can tell you that the maximum page size our own scanning stations can
>> handle (not counting the special tables for imaging "foldouts") is
>> about 9.5" x 14".
>>
>> I added Edward and Karen because either of them could probably do some
>> basic statistical analysis for you on large numbers of MARC records,
>> and the MARC records often include the book's height.
>
> Thanks!  The 9.5x14 information is useful, in that it tells me I'll need
> more data -- because that's already larger than we can probably build a
> mass-marketable scanner for :-).
>
> Edward and Karen, if you want to just send me a pointer to the MARC
> records, I'd be happy to download them myself and do the analysis.
>
> Dan also suggested asking Robert, so I'm CC'ing him.  Robert, my
> original question is below; any stats or off-the-top-of-the-head
> knowledge you have would be most welcome.
>
> Best,
> -Karl
>   
>> Karl Fogel wrote:
>>     
>>> Can you point me to the right person to ask for some stats on book
>>> sizes, based on Archive.org's book scannings?
>>>
>>> I'm working with a project (bkrpr.org) to market an affordable personal
>>> book scanner, and we're wondering what the maximum size of "common"
>>> books is.  Obviously, the bigger they build the device the more
>>> expensive it is to manufacture, so knowing that, say, less than 0.5% of
>>> all books are more than 10 inches in height or whatever would be very
>>> helpful.
>>>
>>> The Internet has some data on this (http://en.wikipedia.org/wiki/Book_size,
>>> for example), but not the kind of broad sample that Archive.org probably
>>> has...
>>>
>>> Thanks,
>>> -Karl


--- End Message ---