Replies: 4 comments 1 reply
-
If you try BerkeleyDB, I think it should work as expected. CC @dk-github, who is most familiar with the in-memory adapter.
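For reference, switching to BerkeleyDB JE is a configuration change; a minimal sketch (the directory path here is just an example):

```properties
# Use the BerkeleyDB JE storage backend instead of "inmemory"
storage.backend=berkeleyje
storage.directory=/tmp/janusgraph-berkeleyje
```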
-
Yes, the assumption in the original question doesn't hold for the in-memory
backend. This fact is actually documented here:
https://docs.janusgraph.org/storage-backend/inmemorybackend/#:~:text=The%20backend%20offers%20store%2Dlevel%20locking%20only
That doesn't mean parallel operations are impossible with the in-memory
backend - one can still use it successfully with massively parallel
modifications, as long as those modifications touch "different parts" of the
graph, such that only different node-local stores are modified in parallel
and only different chunks of the global indexes are updated. In fact, we run
such a system where I work with no consistency issues, by introducing a few
rules on what the individual "business" components can and cannot do,
grouping updates by "keys", etc. But of course this is not always possible.
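As a stdlib-only sketch of that "grouping updates by keys" idea (this is not JanusGraph API - the class and method names below are hypothetical): route every update for a given key to the same single-threaded executor, so updates to the same "part" of the graph never run concurrently while different keys still proceed in parallel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class KeyPartitionedUpdates {
    // One single-threaded executor per shard: updates for a given key
    // always land on the same shard, so they are serialized per key.
    private final ExecutorService[] shards;

    KeyPartitionedUpdates(int n) {
        shards = new ExecutorService[n];
        for (int i = 0; i < n; i++) shards[i] = Executors.newSingleThreadExecutor();
    }

    Future<?> submit(String key, Runnable update) {
        int shard = Math.floorMod(key.hashCode(), shards.length);
        return shards[shard].submit(update);
    }

    void shutdown() throws InterruptedException {
        for (ExecutorService s : shards) {
            s.shutdown();
            s.awaitTermination(5, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws Exception {
        KeyPartitionedUpdates router = new KeyPartitionedUpdates(4);
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        List<Future<?>> futures = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            String key = "vertex-" + (i % 10);
            // Unsynchronized read-modify-write: safe only because all
            // updates for one key are serialized on one shard.
            futures.add(router.submit(key, () ->
                    counts.put(key, counts.getOrDefault(key, 0) + 1)));
        }
        for (Future<?> f : futures) f.get();
        router.shutdown();
        System.out.println(counts.get("vertex-0"));
    }
}
```

The same routing idea applies whether the "update" mutates a map, as here, or commits a JanusGraph transaction touching only that key's vertices.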
IMO it is worth reading the whole documentation page on the in-memory
backend, as it has further information and pointers on how the actual
JanusGraph transaction commit works, what gets locked when, etc., as well as
links to possible alternatives. In addition to those listed in the
documentation, I believe more possible alternatives were recently
mentioned/offered for contribution - e.g. a Redis backend, a Couchbase
backend, etc. - but just as with all other backends, one shouldn't assume
100% ACID properties by default. In fact, the "front-end" graph engine
doesn't even try all that hard to facilitate them (or it didn't, last time I
looked a few years ago). My understanding is that this is because a lot of
the "original" use cases do a periodic bulk update of the graph and then
just query it most of the time, as opposed to OLTP-style use where updates
happen in parallel all the time.
In summary, the suggestion is to:
a) try different alternative backends, based on the documentation and the
properties/capabilities of the underlying store;
b) try different modes of parallel modification which could be more
compatible with the "relaxed" locking the graph engine implements;
c) if all else fails, try a different graph engine which does claim ACID
properties :)
Thanks,
Dmitry
Answer selected by fmggomes
-
Also, the Aerospike backend guys have a nice section about achieving
consistency when the underlying store doesn't support ACID transactions,
using a write-ahead log:
https://github.com/PlaytikaOSS/aerospike-janusgraph-storage-backend#:~:text=So%20we%20need%20to%20emulate%20transactional%20behaviour%20and%20not%20surprisingly%20made%20it%20via%20Write%20Ahead%20Log
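As a toy illustration of the write-ahead-log idea (assuming nothing about the Aerospike backend's actual design): record each batch of mutations in the log before applying it, so a batch interrupted mid-apply can be completed by replaying the log instead of leaving the store half-updated.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MiniWal {
    // Toy write-ahead log: mutations are recorded per batch before being
    // applied, so replaying the log reproduces the store state batch by
    // batch (all-or-nothing per batch after recovery).
    final List<Map<String, String>> log = new ArrayList<>();
    final Map<String, String> store = new HashMap<>();

    void commit(Map<String, String> batch) {
        log.add(new HashMap<>(batch)); // 1. durably record the intent
        batch.forEach(store::put);     // 2. apply it to the store
    }

    // Crash recovery: rebuild the store purely from the log.
    Map<String, String> replay() {
        Map<String, String> rebuilt = new HashMap<>();
        for (Map<String, String> batch : log) rebuilt.putAll(batch);
        return rebuilt;
    }

    public static void main(String[] args) {
        MiniWal wal = new MiniWal();
        wal.commit(Map.of("v1", "a", "v2", "b"));
        wal.commit(Map.of("v2", "c"));
        // Replaying the log yields exactly the live store state.
        System.out.println(wal.replay().equals(wal.store));
    }
}
```

A real WAL would of course persist the log durably (fsync) before applying, which is exactly the part an in-memory sketch like this elides.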
-
Good luck! :)
Just some quick random thoughts below - apologies if I am not making sense:
You might end up having to look at the frontend as well, and at the
interface between the frontend and the backend - the key is to understand
whether what happens there currently is enough to achieve all the ACID
properties your use case cares about just by providing some backend
implementation... I don't remember the details, but I have some lingering
memories that it probably wasn't.
In other words, it is good to be sure whether your broken graph is caused by
deficiencies in the backend implementation, or by "deficiencies" in, e.g.,
how, when and which locks are acquired (or not acquired) by the frontend,
the need for something like a WAL in between, etc.
With Postgres, you may be able to hack across that interface to make sure
that you start the underlying read-only Postgres transaction when the
JanusGraph transaction starts, and on JanusGraph commit you upgrade it to a
read/write transaction, do the updates and commit... That should put all the
burden of efficient locking and sequencing onto Postgres - although I am not
sure how performant that would be, depending on what sort of graph
transactions you will be having and how you translate them into Postgres...
Generally, complex graphs/transactions, ACID, and RDBMS tables as the
relation store tend not to play very effectively together - but maybe you
are solving a less generic problem, or using some hybrid schema...
I think there were a few attempts at/offerings of backends based on an
RDBMS - you can try searching the archives - but I am not sure if any of
them ultimately reached any kind of maturity.
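One way to lean on Postgres for the sequencing, sketched here with no JanusGraph SPI code at all: run each unit of work under SERIALIZABLE isolation (`Connection.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE)`) and simply retry when Postgres aborts it with a serialization failure, SQLState 40001. The retry helper below is a hypothetical sketch, exercised with a fake workload instead of a real connection; only `java.sql.SQLException` is real API.

```java
import java.sql.SQLException;
import java.util.concurrent.Callable;

public class SerializableRetry {
    // Retry a unit of work when it fails with a Postgres serialization
    // failure (SQLState 40001). Under SERIALIZABLE isolation, Postgres
    // carries the locking/sequencing burden; the client just retries.
    static <T> T withRetry(Callable<T> work, int maxAttempts) throws Exception {
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.call();
            } catch (SQLException e) {
                if (!"40001".equals(e.getSQLState())) throw e; // not retryable
                last = e; // serialization failure: safe to retry
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical workload that fails twice with 40001, then succeeds.
        int[] calls = {0};
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new SQLException("could not serialize access", "40001");
            }
            return "committed";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In a real adapter, the `Callable` body would begin a transaction, perform the graph mutations, and commit; everything outside the retry loop stays oblivious to the aborts.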
On Wed, 6 Dec 2023 at 11:06, fmggomes wrote:
> Thanks so much for the explanation. The reason I was testing this is
> because I was writing my own storage adapter for Postgres. I had high
> hopes that this way we could achieve ACID transactions, since we could use
> the locking, commit and rollback of the database itself. But this didn't
> work. Even though my implementation of acquireLock(...) detects collisions
> that cause rollbacks and the retrying of transactions, I still get a graph
> that is broken.
> I will then look for inspiration in storage adapters that claim strong
> "acidity" :)
-
My assumption was that JanusGraph transaction handling was such that, when not used with an eventual-consistency storage backend, and retrying failed transactions, I would get the same result processing the transactions in parallel as serialized. That was not the case.
Summary of my test: my graph is a chain of 10 vertices. Vertices are identified by a property, for which a unique index was created. Each transaction upserts a vertex, the vertex to its left, the vertex to its right, and the edges between them. At the end I verify that each vertex has an out-edge to the neighbour on its right. This works without parallelization, but it fails at "random" vertices when done in parallel.
I attach a drawing explaining how upserts from different transactions collide.
The code to reproduce (remove or add the parallel() on the stream to switch between modes)
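The actual reproduction code is not included above. As a stdlib-only model of the same failure mode (this is not the JanusGraph test, just an analogy): each "transaction" below upserts a vertex and the edge to its right neighbour with unsynchronized map updates, which is correct on a sequential stream but can silently lose edges if `.parallel()` is added.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.IntStream;

public class ChainUpsertModel {
    public static void main(String[] args) {
        // Adjacency: vertex i -> its out-neighbours (a map of maps).
        Map<Integer, Map<Integer, Boolean>> out = new HashMap<>();

        // Each "transaction" upserts vertex i, vertex i+1, and the edge
        // i -> i+1. Sequential stream: correct. Add .parallel() and the
        // unsynchronized HashMap read-modify-writes can lose edges.
        IntStream.range(0, 9).forEach(i -> {
            out.computeIfAbsent(i, k -> new HashMap<>()).put(i + 1, true);
            out.computeIfAbsent(i + 1, k -> new HashMap<>());
        });

        // Verification step from the test: every vertex 0..8 must have
        // an out edge to the neighbour on its right.
        boolean ok = IntStream.range(0, 9)
                .allMatch(i -> out.get(i) != null && out.get(i).containsKey(i + 1));
        System.out.println(ok);
    }
}
```

The JanusGraph version replaces the map mutations with upsert traversals inside a transaction, but the shape of the race - two concurrent upserts of adjacent vertices clobbering each other - is the same.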