How to get data into Kepler
A central problem is how data arrives from the data sources.
This entry makes a few assumptions about how the data may arrive, and the possible solutions are built around those assumptions.
Possible solutions (with no information on how data arrives from the data sources):
- Read from a file on the local filesystem.
Have a directory in which raw data files are placed, and have Kepler read the files in from there.
Problems: How does Kepler know there is new data to be read?
Possible Solutions/Problems:
  - Constantly poll the directory and check whether new files are available. This would be very inefficient.
  - Have a trigger to execute the Kepler workflow when a new file is available. Is this even possible?
    - Apparently this is possible in Linux. TASK: figure out how!
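The polling option above can be sketched as follows. This is a minimal, standalone Python illustration and not a Kepler actor; the names `find_new_files`, `poll_directory`, `handle_file`, and the 5-second interval are all assumptions made for the example. It treats any file name not seen before as new, and it does not check whether a file has finished being written, which a real setup would need to handle.

```python
import os
import time


def find_new_files(directory, seen):
    """Return paths of regular files in `directory` that are not yet in `seen`.

    `seen` is a set that is updated in place, so each file is reported once.
    """
    new_files = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and path not in seen:
            seen.add(path)
            new_files.append(path)
    return new_files


def poll_directory(directory, handle_file, interval_seconds=5.0, max_polls=None):
    """Check `directory` every `interval_seconds` and call `handle_file` on new files.

    `handle_file` is a hypothetical callback, e.g. something that starts a workflow.
    `max_polls=None` means poll forever.
    """
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for path in find_new_files(directory, seen):
            handle_file(path)
        polls += 1
        time.sleep(interval_seconds)
```

Every call to `find_new_files` lists the whole directory even when nothing has changed, which is the inefficiency noted above; a longer interval reduces the cost but delays detection.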
- Read from a file in SRB.
Same as the previous method, except the raw data is stored in a special directory in SRB.
- Have data incoming from a network socket.
This avoids the problem of knowing when new data is available, as the socket can be set to wait for new data.
Problems:
  - Network listening actors would have to be written. This shouldn't really be a problem.
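To illustrate how a socket removes the need to detect new data, here is a minimal standalone Python sketch, again not a Kepler actor. It assumes, purely for the example, that data arrives over TCP as newline-terminated UTF-8 text records; `serve_records`, `listen`, and `handle_record` are hypothetical names.

```python
import socket


def serve_records(server, handle_record, max_connections=None):
    """Accept connections on a listening socket and pass each line to `handle_record`.

    `server.accept()` blocks until a sender connects, so no polling is needed.
    `max_connections=None` means serve forever.
    """
    served = 0
    while max_connections is None or served < max_connections:
        conn, _addr = server.accept()
        # Read the connection as text, one record per line, until the sender closes it.
        with conn, conn.makefile("r", encoding="utf-8") as stream:
            for line in stream:
                handle_record(line.rstrip("\n"))
        served += 1


def listen(host, port, handle_record):
    """Open a listening TCP socket on (host, port) and serve records indefinitely."""
    with socket.create_server((host, port)) as server:
        serve_records(server, handle_record)
```

This sketch handles one connection at a time; whether that is sufficient depends on how many data sources send concurrently, which is not known here.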