Replies: 1 comment
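It looks like the API doesn't have a solution for this problem yet, but I can suggest trying something like the code below. It assumes that in your CSV data one line == one record. Sometimes a record can span several lines; if that's the case, then I'm afraid there's no workaround and we need to modify our reader.

    import java.io.File
    import java.io.StringReader
    import org.jetbrains.kotlinx.dataframe.DataFrame
    import org.jetbrains.kotlinx.dataframe.io.readDelim

    val file = File("data.csv")
    // Read the header once so it can be prepended to every chunk.
    val header = file.useLines { it.first() }
    // Lazily yields one DataFrame per chunk of up to 1000 records.
    val frames = sequence {
        file.useLines { lines ->
            val sb = StringBuilder()
            lines
                .drop(1) // skip the header line, which is already stored in `header`
                .windowed(1000, 1000, partialWindows = true)
                .forEach { chunk ->
                    sb.appendLine(header)
                    chunk.joinTo(sb, separator = "\n")
                    val df = DataFrame.readDelim(StringReader(sb.toString()))
                    sb.clear()
                    yield(df)
                }
        }
    }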
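
To try the chunking idea without the DataFrame dependency, here is a minimal standard-library sketch of the same approach. The helper name `csvChunks` and its parameters are hypothetical (they are not part of any library), and it carries the same assumption as above: one line == one record.

```kotlin
// Hypothetical helper, not part of the DataFrame API.
// Splits CSV lines into self-contained CSV texts, each starting with the header.
// Assumes one line == one record (no quoted fields containing line breaks).
fun csvChunks(lines: Sequence<String>, chunkSize: Int): Sequence<String> = sequence {
    val iterator = lines.iterator()
    if (!iterator.hasNext()) return@sequence // empty input: nothing to yield
    val header = iterator.next()             // first line is treated as the header
    iterator.asSequence()
        .chunked(chunkSize)                  // the last chunk may be smaller
        .forEach { rows ->
            yield((listOf(header) + rows).joinToString("\n"))
        }
}
```

Each yielded string is a small, complete CSV document, so it could be passed to `DataFrame.readDelim(StringReader(chunk))` as in the snippet above. When reading from a file, call it inside `File("data.csv").useLines { lines -> csvChunks(lines, 1000).forEach { /* parse chunk */ } }` so the file stays open only while the chunks are being consumed.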
Currently I can read a CSV into a List, but for large data this is obviously not a good option.
Is it possible to read the CSV line by line, parse each line, and emit the parsed results into a Flow?