
openerp-connector-community team mailing list archive

Re: Connector for CSV import

 

Hi all


2014/1/6 Guewen Baconnier <guewen.baconnier@xxxxxxxxxxxxxx>

> Hi,
>
> Thanks for bringing this discussion.
>
>
> On 01/03/2014 06:23 PM, Leonardo Pistone wrote:
>
>> Hi all,
>>
>> this general message is just to bring together everybody who is interested in
>> the discussion about using the connector for generic csv imports. Like
>> for e-commerce, this can bring reliability, the ability to catch many
>> errors (in retrieving the file, parsing, inserting into the db etc) and
>> to repeat what went wrong if necessary.
>>
>> Apart from the connector itself, there is already some excellent work in
>> progress by Akretion, at:
>>
>> * https://code.launchpad.net/~akretion-team/file-exchange/file-exchange
>> * https://code.launchpad.net/~akretion-team/+junk/logistic-center
>> * https://code.launchpad.net/~akretion-team/+junk/poc-import-data
>>
>> I gave those modules a try, and here are some points I wish to discuss.
>> First of all, well done.
>>
>> * If I understand correctly, we are spawning a job to import each line
>> of the file. That line is stored in the "buffer", associated to the job.
>> Still, some data needs to be imported in chunks (invoices with lines,
>> for example). That needs a kind of multi-line buffer, probably not with
>> a JSON dictionary "data" field, but something like a list of lists (i.e.
>> a table). What do you think?
>>
>
> You mean the JSON being a list instead of a dictionary? How is the data
> given to the load_data OpenERP methods represented?
>

There are two different cases:

Logistic case:
We parse the file ourselves, so if we have a file of stock moves we parse
it and group the stock moves by picking. If a file contains 4 pickings
spread over 30 lines (so 30 stock moves), we group the lines and create
only 4 jobs, one job per picking.
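
To make the grouping concrete, here is a minimal sketch in Python. It is not
the code from the logistic-center branch: the column name "picking_ref" and
the function name are hypothetical, and each parsed line is assumed to be a
dict.

```python
from collections import OrderedDict


def group_moves_by_picking(rows, key="picking_ref"):
    """Group parsed stock move rows by their picking reference.

    `rows` is a list of dicts, one per line of the file.
    The column name `picking_ref` is an assumption for this sketch.
    """
    groups = OrderedDict()
    for row in rows:
        # Keep the order in which pickings first appear in the file.
        groups.setdefault(row[key], []).append(row)
    return groups


rows = [
    {"picking_ref": "OUT/001", "product": "A", "qty": 2},
    {"picking_ref": "OUT/002", "product": "B", "qty": 1},
    {"picking_ref": "OUT/001", "product": "C", "qty": 5},
]
groups = group_moves_by_picking(rows)

# One job would be created per key of `groups`, not per line.
for picking_ref, moves in groups.items():
    print(picking_ref, len(moves))
```

With this shape, the job payload for one picking is simply the list of its
lines.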

POC of importing data:
The POC is immature; the aim was just to test some concepts and to help me
migrate my customer's data. The main idea is to use the native import
interface but run it through connector jobs, so that we no longer waste
hours when one line of the file to import is wrong.
For now I only support simple files, without any grouping.

So if we want to import invoices on this basis, I would recommend for now
proceeding in three steps:
step 1: import a file with the invoice data.
step 2: import a file with the invoice line data.
step 3: when everything is OK, validate the invoices if needed.

But maybe the best solution for importing invoices is not to use the native
OpenERP import interface but instead to build your own module based on the
connector. Indeed, with a custom module you will be able to handle every
case (validating a SO, cancelling it, generating the invoice...), that is,
not just importing data but also performing actions on it.
I think the aim of the POC is just to improve the user experience when
importing/exporting simple objects like products.

Regarding the data in the case of the POC: I always create one buffer per
line. The data, which is a dictionary, is stored in the JSON field. As the
OpenERP load function processes a list of values, when I create the buffer I
convert the list into a dictionary (I inject the "field" information), and
then when I process the job I convert the dict back into a list, so it is
compatible with the load function.
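
The round trip between the list and the dictionary can be sketched like this.
It is only an illustration, not the POC code: the function names are
hypothetical, and I assume the ordered list of field names is kept alongside
the buffer so the row can be rebuilt in the right order.

```python
import json


def row_to_buffer_data(fields, row):
    """Turn one row (a list of values) into a JSON string keyed by field name."""
    if len(fields) != len(row):
        raise ValueError("row length does not match the field list")
    return json.dumps(dict(zip(fields, row)))


def buffer_data_to_load_args(data, fields):
    """Rebuild the (fields, rows) pair that a load-style call expects.

    `fields` gives the column order; the JSON dict alone is not relied
    upon for ordering.
    """
    values = json.loads(data)
    return fields, [[values[name] for name in fields]]


fields = ["name", "default_code", "list_price"]
row = ["Chair", "CH-01", "49.90"]

data = row_to_buffer_data(fields, row)          # stored on the buffer
load_fields, load_rows = buffer_data_to_load_args(data, fields)
```

Here load_rows contains a single row, since there is one buffer per line.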


>
> On a technical point, if I can understand why a 'buffer_id' field has been
> added on the queue.job (for the UI, easy access to the buffer), I dislike
> that. A job should stay agnostic of its function, and stay as close as
> possible to job queues based on MQs (ideally replaceable by RabbitMQ for
> instance). Another reason is that the buffer_id is already there in the
> stored arguments for the function (duplicated data). If we wanted to have
> an access to the buffer from the jobs, I would prefer to be able to
> associate an action to a job, the action having the knowledge on how to
> parse the arguments and open the record.
>
Yes, I agree that duplicated data is not clean, and I understand that
the job should stay agnostic. The aim of storing the id was just to be able
to quickly open the linked buffer from the job, but that can be solved in
another way. I think your idea of adding a parse function to get the id
from the stored arguments is a good one.

It could also be used in many other cases (exporting products, pickings...).
Every time a job is linked to an object, it would be great to have a
button to open the linked object.
How can we implement this feature?
- a boolean (function field) that tests whether there is a 'linked OpenERP
object'; if so, the 'open related object' button is visible on the job
- a parser function that extracts the related object from the
arguments, or from the result (for a done import job)

Can this be implemented in the connector? What do you think?
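
As a rough sketch of the parser idea, in plain Python and outside of any
OpenERP model: nothing here is existing connector API. The registry, the
decorator, the job function name "import_one_line", its argument layout and
the model name "file.buffer" are all hypothetical.

```python
# Hypothetical registry: job function name -> parser of its stored arguments.
RELATED_RECORD_PARSERS = {}


def related_record_parser(func_name):
    """Register a parser that knows how to read the args of one job function."""
    def decorator(parser):
        RELATED_RECORD_PARSERS[func_name] = parser
        return parser
    return decorator


@related_record_parser("import_one_line")
def _parse_import_one_line(args, kwargs):
    # Assumed stored args for this job: (model_name, buffer_id).
    buffer_id = args[1]
    return ("file.buffer", buffer_id)


def get_related_record(func_name, args, kwargs=None):
    """Return (model, record_id) for a job, or None if no parser is known."""
    parser = RELATED_RECORD_PARSERS.get(func_name)
    if parser is None:
        return None
    return parser(args, kwargs or {})


def has_related_record(func_name, args, kwargs=None):
    """What the boolean function field could compute for the button."""
    return get_related_record(func_name, args, kwargs) is not None
```

The job itself stays agnostic: it only stores its arguments, and the knowledge
of how to read them lives with the parser registered for that function.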



>
>> * The file-exchange modules have the ability to move imported files to
>> the "archive" folder. Should we have also a place to move the files we
>> can't read?
>>
>
> I think this is something that should definitely be taken into account.
> We could also need a "failures" folder that contains the files with errors
> or lines with errors (according to the atomicity of the entire file /
> lines). It can be necessary for the sender of the files to have these
> files so they can correct them. I don't know how and if it can be handled
> though.
>

Indeed, in some cases it can be necessary to send feedback on the failed
lines. Maybe a good solution would be to generate a file based on the failed
jobs, depending on the job error?
In any case this should be done as a separate process:
step 1: import the file into file_document
step 2: process the file_document
step 3: generate a file_report for all of the errors (and put the file
into a file document)
step 4: export the file to the external system (ftp, sftp, ...)
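
For step 3, a minimal sketch of what generating the report could look like.
This is an assumption about the shape of the data, not existing code: each
failed job is represented as a dict with the buffered line under "data" and
the error message under "error", and the function name is hypothetical.

```python
import csv
import io


def build_failure_report(failed_jobs, fields):
    """Build a CSV report with one row per failed line plus its error."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # Same columns as the imported file, with the error appended.
    writer.writerow(list(fields) + ["error"])
    for job in failed_jobs:
        line = [job["data"].get(name, "") for name in fields]
        writer.writerow(line + [job["error"]])
    return out.getvalue()


fields = ["name", "default_code"]
failed_jobs = [
    {"data": {"name": "Chair", "default_code": ""},
     "error": "Missing default_code"},
]
report = build_failure_report(failed_jobs, fields)
```

The resulting text could then be stored as a file document and exported in
step 4, so the sender gets back only the lines to correct.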



>
> Thanks!
>
>
> --
> Guewen Baconnier
> Business Solutions Software Developer
>
> Camptocamp SA
> PSE A, CH-1015 Lausanne
> Phone: +41 21 619 10 39
> Office: +41 21 619 10 10
> http://www.camptocamp.com/
>
> --
> Mailing list: https://launchpad.net/~openerp-connector-community
> Post to     : openerp-connector-community@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~openerp-connector-community
> More help   : https://help.launchpad.net/ListHelp
>
