Create searchable images in PDFs via the SDK (using .NET)


Good afternoon,

We're starting to roll out a PDF document generation solution based on XSL-FO transformation of XML to PDF files. It works fine; the one current constraint is that embedded images are not searchable by users.

Now, I understand this can in fact be done manually when Acrobat Professional is installed, using its text recognition/OCR functionality, and I am wondering whether the same technology is available through the SDK.

As I haven't worked with the SDK myself yet, I'd like to know how it performs when used in .NET/C#-based applications. Are there specific constraints or limitations this environment might cause?

The way we want to implement this is to add an (optional) post-processing step after the XSL-FO transformation (which takes place on a central server with a queue, etc.) to perform text recognition. Since we create a couple hundred thousand pages each day, reliability and scalability are a big factor. One document-generation machine/server creates a couple of documents in parallel on a standard multicore machine with a couple of gigs of RAM. Has anyone tested the SDK's functionality under heavy load extensively?

I am double-checking because we've hit several walls in the past with such under-communicated limitations of third-party vendors.

Cheers, and thanks,
-Jörg Battermann

> I am wondering whether the same technology is available through the SDK.

Yes, because the SDK automates Acrobat. It is not a self-contained SDK or a redistributable: applications developed with the SDK require a copy of Acrobat installed on the same machine in order to run.

> The way we want to implement this is to add an (optional) post-processing step after the XSL-FO transformation (which takes place on a central server with a queue, etc.) to perform text recognition.

Then you will want to look at a different product, since Acrobat's EULA prevents it from being installed on a server as part of a purely automated workflow.

> Since we create a couple hundred thousand pages each day, reliability and scalability are a big factor.

Again, Acrobat is not technically suitable for that type of environment.

> Has anyone tested the SDK's functionality under heavy load extensively?

No, since again the SDK automates Acrobat, and Acrobat is neither technically suited nor licensed for that type of high-availability server environment.

Have you contacted Adobe to find out if any of the LiveCycle server products suit your needs? LiveCycle PDF Generator has OCR functionality (I'm not sure if it will directly suit your needs; you should contact Adobe directly for that answer), but I know it can OCR filetypes such as TIFF, and it is designed to work in a high-availability server environment.

