Parsing async #219

Open
fairhat opened this issue Jul 10, 2015 · 8 comments

Comments

@fairhat

fairhat commented Jul 10, 2015

Hey,

I've been using this module to convert XML files to JSON, split the resulting object into several parts, and save them to MongoDB.
However, I am using the latest version with async: true in the config, yet the whole parsing process still seems to block the thread completely.

Is the async config broken, or am I doing something wrong?

@Leonidas-from-XIV
Owner

No, you're not missing anything. I thought that sax.js worked in an async fashion, but it turns out the underlying EventEmitter is blocking.
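
To illustrate the point about EventEmitter, here is a minimal sketch using only Node's built-in `events` module (no sax.js or xml2js involved). The event name `tag` and the `order` array are made up for the demo. It shows that `emit()` calls its listeners synchronously, so an event-driven parser built on it does not yield to the event loop on its own.

```javascript
const { EventEmitter } = require('events');

const emitter = new EventEmitter();
const order = []; // records the sequence in which things happen

// Hypothetical listener, standing in for a parser's tag handler.
emitter.on('tag', (name) => {
  order.push(`listener saw ${name}`);
});

order.push('before emit');
emitter.emit('tag', 'item'); // the listener runs right here, before emit() returns
order.push('after emit');

console.log(order);
```

The listener's entry lands between the two others, which is why a long chain of emitted events keeps the thread busy until the whole input has been consumed.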

@fairhat
Author

fairhat commented Oct 9, 2015

Yeah, I wrapped it in an isolated process in a Node.js cluster. I can share the code if anyone actually needs it.

@tflanagan
Contributor

That is certainly one method, and the concept of spawning a process (hopefully, soon, a worker) to do this is something I would +1.

I'd be interested in benchmarks on large XML datasets and the building/parsing of them. See if it's even worth it.

Having a new process do this might not be worth it until workers are available (to cut down on IPC latency/mem usage.)
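
For reference, a rough sketch of the worker idea using Node's `worker_threads` module, which only became available in later Node versions. This is not the code used in this thread. `runInWorker` and `parseXmlSource` are hypothetical names, and the worker body assumes xml2js is installed and resolvable from the working directory.

```javascript
const { Worker } = require('worker_threads');

// Hypothetical helper: runs `workerSource` in a worker thread, passes `input`
// to it as workerData, and resolves with the first message the worker posts.
function runInWorker(workerSource, input) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: input });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', (code) => {
      // No effect if the promise has already been settled by a message.
      if (code !== 0) reject(new Error(`worker exited with code ${code}`));
    });
  });
}

// Worker body as a string (assumption: xml2js can be required from here).
// It posts back either { result } or { error } so the main thread can decide.
const parseXmlSource = `
  const { parentPort, workerData } = require('worker_threads');
  const { parseString } = require('xml2js');
  parseString(workerData, (err, result) => {
    parentPort.postMessage(err ? { error: err.message } : { result });
  });
`;

// Usage (not executed here):
// runInWorker(parseXmlSource, '<root><item>1</item></root>')
//   .then(({ error, result }) => console.log(error || result));
```

The parsed object still has to be copied back to the main thread, so for very large documents the transfer cost is part of what a benchmark would need to measure.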

Edit: Spelling / Grammar.

@fairhat
Author

fairhat commented Nov 11, 2015

Well, if your XML-parsing function is tied to an API server, it is pretty much always worth it, no matter how "fast" it is. We were parsing 50MB+ XML files, and the processing time was mostly between 15 and 500+ seconds, blocking the whole API while it worked.

@tflanagan
Contributor

500+ seconds?! Even with a new process to avoid blocking other API calls, that would kill any front-end usability.

Sounds like you should try to chunk those down to smaller XML docs.

@fairhat
Author

fairhat commented Nov 11, 2015

Well, we're parsing the XML, splitting the JSON into smaller objects, and throwing the data into MongoDB. After that, the app doesn't need to be updated unless the XML changes, which is about twice per month on average (depending on our client).
The changes are small anyway, so they can live with the cached version until the update is finished.
So the client actually just clicks on "update" and forgets about it until it is finished (and he gets a notification).
Not saying it's the greatest way to deal with that, it's just that I don't have much of a choice about changing the XML files. :-|

Edit: I should mention the XML docs are generated with some really old enterprise software.
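
As an illustration of the "split into smaller objects" step, here is a minimal sketch of batching an array of records before storing them. The `chunk` helper and the sample data are assumptions made for this example, not the code from this project, and the database write is left out.

```javascript
// Hypothetical helper: split an array of records into batches of at most `size` items.
function chunk(records, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < records.length; i += size) {
    batches.push(records.slice(i, i + size));
  }
  return batches;
}

// Sample data standing in for records pulled out of a parsed document.
const exampleBatches = chunk(['a', 'b', 'c', 'd', 'e'], 2);
console.log(exampleBatches); // [ [ 'a', 'b' ], [ 'c', 'd' ], [ 'e' ] ]
```

Each batch could then be written in its own insert call, which keeps individual writes small even when the source document is large.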

@wmelton

wmelton commented Mar 28, 2021

@fairhat not sure if this will solve your exact problem, but using native Promises with async/await might be the solution you need. See below:

// Must run inside an async function (or a module with top-level await).
const result = await new Promise((resolve, reject) => {
	parser.parseString(xml, (err, parsed) => {
		if (err) reject(err);
		else resolve(parsed);
	});
});

@fairhat
Author

fairhat commented Mar 28, 2021

Hello wmelton, thanks for your suggestion. The issue is 5 years old and I have changed jobs since. The problem wasn't related to Promises or async/await syntax: even when you're using promises, the operation still blocks the main thread.

Not sure if the repo has been updated since.
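
A small sketch of why the Promise wrapper does not help with CPU-bound work. `busyParse` is a made-up stand-in for a synchronous parser (it just spins for a given number of milliseconds) and is not xml2js. The Promise executor runs synchronously, so the caller is blocked before it even gets the promise back.

```javascript
// Stand-in for a CPU-bound, synchronous parser: spins for `ms` milliseconds.
function busyParse(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // burn CPU; nothing else on this thread can run meanwhile
  }
  return 'parsed';
}

// Same shape as the Promise wrapper suggested above.
function parseWithPromise(ms) {
  return new Promise((resolve) => resolve(busyParse(ms)));
}

const started = Date.now();
const pending = parseWithPromise(50); // the executor (and busyParse) runs right here
const blockedFor = Date.now() - started;

console.log(`blocked for ${blockedFor} ms before the promise was returned`);
pending.then((value) => console.log(value));
```

Wrapping only changes how the result is delivered. To keep the main thread responsive, the work itself has to move to another thread or process, or be split into pieces that yield to the event loop between them.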

4 participants