I'm now a couple of weeks into my node deep dive. I've learned a lot from Anthony's excellent course on udemy and I'm currently going through a book " nodejs the right way". I've also gone through quite a few articles that brought up some very good points about real world scenarios with node and coupling other technologies out there.
HOWEVER, it seems to be accepted as law, that you don't perform computationally heavy tasks with Node as its a single thread architecture. I get the idea of the event loop and asynch callbacks etc. In fact nodes strength stems from tons of concurrent IO connections if I understand correctly. No matter where I'm reading though, the source warns against hanging up that thread executing a task. I can't seem to find any rule of thumb of things to avoid using nodes process for. I've seen a solution saying that node should pass computationally heavy tasks to a message service like RabbitMQ which a dedicated app server can churn through(any suggestions on what pairs well with node for this task? I read something about an N-tier architecture). The reason I'm so confused is because I see node being used for reading and writing files to highlight the usage of streams but in my mind fetching/reading/writing files is an expensive task (I feel mistaken).
Tl;Dr What kind of tasks should node pass off to a work horse server ? What material can I read that explains the paradigm in detail?
Edit: it seems like my lack of understanding stemmed from not knowing what would even halt a thread in the first place outside of an obviously synchronous IO request . So if I understand correctly reading and writing data is IO where mutating said data or doing mathematical computations is computationally expensive (at varying levels depending on the task of course ) . Thanks for all the answers!
If you're using node.js as a server, then running a long running synchronous computational task ties up the one thread and during that computation, your server is non-responsive to other requests. That's generally a bad situation for servers.
So, the general design principles for node.js server design is this:
Use only asynchronous I/O functions. For example, use
If you have a computationally intense operation, then move it to a child process. If you do a lot of these, then have several child processes that can process these long running operations. This keeps your main thread free so it can be responsive to I/O requests.
If you want to increase the overall scalability and responsiveness of your server, you can implement clustering with a server process per CPU. This isn't really a substitute for #2 above, but can also improve scalability and responsiveness.
The reason I'm so confused is because I see node being used for reading and writing files to highlight the usage of streams but in my mind fetching/reading/writing files is an expensive task (I feel mistaken).
If you use the asynchronous versions of the I/O functions, then read/writing from the disk does not block the main JS thread as they present an asynchronous interface and the main thread can do other things while the system is fetching data from the disk.
What kind of tasks should node pass off to a work horse server ?
It depends a bit on the server load that you are trying to support, what you're asking it to do and your tolerance for responsiveness delays. The higher the load you're aiming for, then the more you need to get any computationally intensive task off the main JS thread and into some other process. At a medium number of long running transactions and a modest server load, you may just be able to use clustering to reach your scalability and responsiveness goal, but at some threshold of either length of the transaction or the load you're trying to support, you have to get the computationally intensive stuff out of the main JS thread.