-
Notifications
You must be signed in to change notification settings - Fork 423
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GSoC] Parallel-Safe Collections Module and Distributed Data Structures #7062
[GSoC] Parallel-Safe Collections Module and Distributed Data Structures #7062
Conversation
Both GSoC mentors |
Two things for the PR message:
|
Gating Issues: #7003 as it produces undefined behavior when LICM is enabled and the return value of #6542 as it produces a compiler segmentation fault when the return value is ignored/not captured at all. Hence meaning that using |
Thanks @mppf I have done both. Should I rename the PR to 'Parallel-Safe Collections Module'? |
@LouisJenkinsCS - yes I think that renaming is a good idea. Thanks! |
Thanks Louis! I discussed with @mppf offline and will give this an initial review a little later today. Meanwhile, I don't know whether you have signed a CLA yet. If not, maybe you can take a look at this and get it out of the way while waiting for reviews: https://github.com/chapel-lang/chapel/tree/master/doc/rst/developer/contributorAgreements |
We already have a contributor agreement for Louis. |
Would also like to make known my other question on a potential revision, here, on how I would like to remove 'freeze' logic as its no longer needed like originally planned and comes at a penalty to performance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I finished a full review of the PR. My comments are mostly trivial changes sprinkled with some questions to the reviewer from the team
modules/standard/Collection.chpl
Outdated
Adds an element to this data structure. | ||
*/ | ||
inline proc add(elt : eltType) : bool { | ||
halt("'proc add(elts : eltType ... ?nElts) : bool' is not supported..."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error message is seemingly from an older version of add
modules/standard/Collection.chpl
Outdated
|
||
**FIX:** Use the `--no-loop-invariant-code-motion` to disable LICM. | ||
Otherwise, just make sure you always capture the return value inside of a loop | ||
in a variable not declared outside of loop... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think linking to github issues can be useful for some users who wants to know more about the bug/or state their concerns regarding the issue.
modules/standard/Collection.chpl
Outdated
/* | ||
Obtain the number of elements contained in this collection. | ||
*/ | ||
inline proc size() : int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To bring the collections more inline with the existing List
module, I think we can remove the parentheses and maybe add a synonym length
? Arguably, that module needs to be incorporated into this framework but until then let's not diverge too much from it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parenthesis methods cannot be overloaded, but having a synonym for 'length' seems spot on since 'length' can just call 'size()'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right but you could create a proc Collection.size { return this.getSize(); }
e.g.
modules/standard/Collection.chpl
Outdated
Iterate over all elements in the data structure. | ||
|
||
**BUG:** Compiler does not currently allow overloading standalone or leader/follower | ||
iterators, and as such only serial iterators may be used with the base type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly a link to github issue here
modules/standard/DistributedBag.chpl
Outdated
private const ADD_AVERAGE_CASE = 1; | ||
private const REMOVE_BEST_CASE = 2; | ||
private const REMOVE_AVERAGE_CASE = 3; | ||
private const REMOVE_WORST_CASE = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All these consts looks like they could be params. Also types should be omitted unless there is an issue making them necessary (i.e. is it a problem if STATUS_* becomes an int with type inference?)
// gets added, as well as a barrier to prevent the head and tail from being cached. | ||
var size : atomic int; | ||
|
||
inline proc recycleNode() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd find createNode
more appropriate but it is more personal taste
pragma "no doc" | ||
inline proc getPrivatizedThis { | ||
return chpl_getPrivatizedCopy(this.type, pid); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly privatization methods should be marked with TODO or FIXME or something
*/ | ||
proc pop() : (bool, eltType) { | ||
return popBack(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think push/pop should use front instead of back, but I don't know if that makes any practical difference
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since Enqueue
pushes to the back, I'd expect that if I did enqueue(); pop();
to get the same thing I just added, hence I have both use the same end.
nodes[idx] = (size, tailIdx, node); | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe names FIFO/LIFO can be clearer.. iterFIFO
/iterLIFO
or something.. or even iter iterator(param order: orderKind
where orderKind
is an enum or something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I was thinking of having these(param order : orderKind = orderKind.NONE)
, but I wasn't sure. That way I'd have orderKind.NONE, orderKind.FIFO, orderKind.LIFO
, etc. As well, since enums no longer causes a compiler internal error with deadCodeElimination (fixed in #6570), I have no issue with enums anymore.
@@ -0,0 +1,2 @@ | |||
-sisBounded --main-module=DequeParity | |||
--main-module=DequeParity |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a need for --main-module flag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question, that was from before, but I do need it to run both with and withou t-sisBounded
set, and I wasn't sure if blank lines get counted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can do:
-sisBounded=true
-sisBounded=false
modules/standard/Collection.chpl
Outdated
} | ||
|
||
*/ | ||
inline proc remove() : (bool, eltType) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm used to seeing remove calls refer to a specific element - where the pattern here is more what I'd probably call pop
. Is there a particular reason you chose add/remove vs. push/pop? Any precedent?
Of course it's the case that we don't want to imply that all of these Collections meet some particular definition of Stack ordering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No-inline here too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I didn't want to imply any particular ordering, and add
and remove
seemed descriptive enough, as remove
in this case means "remove any item".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- ...
modules/standard/Collection.chpl
Outdated
/* | ||
Adds an element to this data structure. | ||
*/ | ||
inline proc add(elt : eltType) : bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't expect virtual methods that just halt should be marked inline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reasoning behind declaring it inline
is due so that the user will know the precisely location they have invoked the unsupported method. Normally, if it is not inlined, it shows a line number in Collections.chpl, which isn't entirely helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- ...
modules/standard/Collection.chpl
Outdated
/* | ||
Determine whether an element exists in this collection. | ||
*/ | ||
inline proc contains(elt : eltType) : bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no-inline
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- ...
modules/standard/Collection.chpl
Outdated
/* | ||
Obtain the number of elements contained in this collection. | ||
*/ | ||
inline proc getSize() : int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no inline here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- ...
modules/standard/Collection.chpl
Outdated
As such, this cannot be used where `ref` intents cannot be used, such as ``forall`` | ||
and ``coforall`` loops. | ||
*/ | ||
proc +=(ref c : Collection(?eltType), elt : eltType) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one seems more reasonable to inline, if you want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- ...
modules/standard/Collection.chpl
Outdated
@@ -0,0 +1,116 @@ | |||
/* | |||
A 'Collection' is a data structure, a container for elements that provide support | |||
for insert, lookup, remove, and iteration operations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the number of open issues / compiler bugs here, I'd like to recommend that this module go into modules/packages. It is used in the same way as it would be if it went in to standard/. My expectation would be that once some of these user-facing issues that limit its usefulness are addressed, it can move to standard/.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(and if it wasn't clear, we'd want to put all of your new modules in either standard/ or packages/ together)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no objections to that. I placed it in standard/ because the List
module was also in standard/.
modules/standard/DistributedBag.chpl
Outdated
or, for example a near empty bag, we fall into the 'average-case', etc. Examples | ||
of 'best-case' scenarios for a removal would be when a segment is unlocked | ||
and contains some elements we can drain, and the 'average-case' would be to find | ||
any segment, unlocked or not, that contains elements we can drain, and so on. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you tried to use chpldoc to render the documentation for your new modules? (While that's a good idea, it's not necessarily required). In this case in particular, there's usually a comment at the top of the module that offers a high level introduction. If that's what this comment is, it's too detailed. I'm not objecting to this comment's existence - I'm just saying that we'd need an even more basic comment (What is this and why would I use it?) before any comment that goes into as much detail about performance as this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation (using chpldoc) can be seen here. I normally left the high-level description for the main structure itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the link - I had forgotten that you'd already generated that.
Right, but it's good for such documentation to start with an introductory paragraph. For this reason, we typically explicitly declare the module (e.g. module DistributedBag { }
) and attach comments to that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That explains why its a huge deal for the classes to not share the same name as the module itself. I'll make that change.
modules/standard/DistributedBag.chpl
Outdated
stealing work, and better locality of elements among segments. As well, to | ||
achieve true parallelism, usage of a privatized instance is a requirement, as | ||
it avoids the overhead of remotely accessing class fields, bounding performance | ||
on communication. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the array implementation (say), the priviatized instances are handled transparently by the module code. I think that doing so in your case would amount to:
- creating a record wrapping these types
- storing the privatized ID a well as the class pointer in the record
- creating forwarding methods from the record to the class instance / privatized class
I can see that you've been making use of the class system in Chapel, so that Bag and Deque are subclasses of Collection. I can think of several reasons why you might have wanted to do that, but my personal expectation would be that Collection would be an "interface" (which does not exist yet in Chapel) and that Bag and Deque are implementations of that interface. I bring this up because if you do the record-wrapping as I describe above, one would no longer be able to store a mixed-type array of Collections that are either Bag or Deque. But I don't think doing so is actually useful, in this particular case.
So, if you were to follow the array strategy:
- Collection, DistributedBag, DistributedDeque would become CollectionImpl, DistributedBagImpl, DistributedDequeImpl
- The new records wrapping these would be called DistributedBag and DistributedDeque (or something, you could consider using lower case to name them for example)
- Users would no longer have to manually
delete
or manage privatized instances
Lastly, I'm noting that this module creates a class with the same name as the module. We need to find a way not to do that - Chapel is a single namespace language - so one namespace needs to be able to identify both the module and the class - otherwise we get some weird compilation problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I can agree that record wrapping is a very good idea, it wouldn't allow me to have any kind of inheritance (issue #6851). Not only would you no longer be able to store mixed-type arrays, you wouldn't be able to store any kind of Collection
as you can no longer inherit, and even if you could you wouldn't be able to use the Collection
in an opaque way (since before removal, they would truncate all methods and fields to the base type).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Furthermore, there is an issue with using records in a forall
or coforall
loop. Example below
record bag {
var someData : atomic int;
proc add(elt) { someData.add(elt); }
}
var b : bag;
forall i in 1 .. 100 {
b.add(i);
}
writeln(b);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use a Collection in an opaque way with generic programming. I'm not sure that supporting mixed-type arrays of Collection is worth the trade-off of requiring users to call privatization functions manually.
I'm having trouble interpreting this:
you wouldn't be able to store any kind of
Collection
as you can no longer inherit
you wouldn't be able to use theCollection
in an opaque way (since before removal, they would truncate all methods and fields to the base type).
What does inheriting have to do with storing a Collection? What do you mean by removal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, you wouldn't be able to do this...
proc addSomeData(c : Collection) {
c.add(1);
c.add(2);
c.add(3);
}
var bag = new DistributedBag(int);
addSomeData(bag);
You would not be able to use it that way at all. You can't say "Give me any collection so I can add elements to it", which I believed was one of the main key-points to using it. Unless, you're saying that we have it like this...
proc addSomeData(c : Collection(?implType));
Where Collection
is generic on the implementation? Can you provide an example of how you envision it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it makes more sense to me to use OwnedObject either directly or as inspiration. Generally speaking, we've been moving away from reference counting things in preference to requiring that the user not delete them before they go out of scope. I think if that strategy is good enough for language-supported arrays, it's good enough for Bag etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lastly, even if you (say) replaced class DistributedBag
with record DistributedBag
, that would probably cause problems with the privatization code. In order to privatize you probably do need record DistributedBag
and class DistributedBagImpl
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be sure, this is an affirmative on committing to record wrapping. I'll begin working on these changes in a new branch immediately (stop me if I assumed incorrectly).
Also, looking at the OwnedObject
module, doesn't it seem more than a bit... restrictive? Looking at the implementation, all it does is disallow the user from using the object in certain contexts using pragmas. SharedObject
, reference counting or not, allows the user to make use of the all data structures as if they are classes, which I strongly believe is the way to go about this. If privatization support improves for classes, then this will not be an issue in the future and since this becomes a 'module/package' it doesn't have to strictly follow the 'standard'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking that with record wrapping, I might be able to do this without requiring explicit privatization (perhaps this might be best?). What if I just create an array of DistributedBagImpl
in the record and have each Locale pick which to use based on that? The only issue is that this type should almost never be used as a wide ref
or else it incurs the cost we're trying to avoid (hence the need for SharedObject
). Speaking of SharedObject
, I think I can make it work by just making a SharedObject
hold on to some helper class that has the sole purpose of destroying the DistributedBag
fields when it gets destroyed.
Furthermore, with all this indirection, I can probably proceed this way, and it probably is best in general. I still have concerns based on how to ensure that as a record it never is used by wide reference (is there a pragma for that?).
As you work towards removing the inheritance, just make Collection exist for documentation purposes
Would it be possible to make it into a CHIP instead? However, if we were to keep the inheritance with the implementations to make Collection
relevant, the issue would next come down to forwarding. If I want to forward only on the privatized instance, is this currently allowed? Etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd lean towards mimicking the pattern of record wrapping done by arrays. That would not necessarily mean using OwnedObject. In particular, the main point there is that when the record is destroyed, the Impl instance is deleted, and there's no provision for keeping the instance alive longer if there are record copies of it. In this way it is similar to using a bare class.
It's probably not currently a good idea to try to use OwnedObject for this purpose, as OwnedObject has some issues (to do with lvalue checking and compiler errors at least) that you could easily avoid just by writing your own constructor / destructor.
If you created an array of DistributedBagImpl, you'd still need privatization / replication in that array to avoid communication when getting the local instance ID. I think it's reasonable to use the privatization mechanism, but having a Replicated array field (say) is a pretty reasonable alternative.
About not wanting to use the record as a wide ref - Ben H. recently committed a mechanism that can be used to serialize/deserialize instances. Obviously it might be more than you're able to do in your last week, but you could try to use that to control what happens when your data structure is used on the other side of an on
statement.
I'm not sure what you mean by forwarding... "If I want to forward only on the privatized instance, is this currently allowed?" I think that's what the array record does, depending on exactly what you mean. Might be best to find me on irc to ask about these.
modules/standard/DistributedBag.chpl
Outdated
there are larger numbers of elements. The better the locality, the better raw | ||
performance and easier it is to redistribute work. | ||
*/ | ||
config param distributedBagInitialBlockSize = 1024; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering - is there a particular reason these need to be compile time constants? Certainly I can think of reasons we might prefer to hide them from the -help output, but I'm having trouble imagining that such configuration settings couldn't be set at launch time? (I.e. config const).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember originally it had to do with using tuples (which is no longer the case), I suppose that slipped my mind. I'll make that change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- ...
modules/standard/DistributedBag.chpl
Outdated
we allocate to increase raw parallelism, and the larger the workload the better | ||
locality (see `distributedBagInitialBlockSize`). This data structure is unordered and employs | ||
its own work-stealing algorithm, and provides a means to obtain a privatized instance of | ||
the data structure for maximized performance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This paragraph is a pretty good introduction. For full-on user documentation you'd also need to include an example of using it.
modules/standard/DistributedBag.chpl
Outdated
*/ | ||
proc getSize() : int { | ||
var sz : atomic int; | ||
forall loc in targetLocales do on loc { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be a coforall
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- ...
modules/standard/DistributedBag.chpl
Outdated
*/ | ||
proc contains(elt : eltType) : bool { | ||
var foundElt : atomic bool; | ||
forall elem in getPrivatizedThis { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I havn't dug in to the these call below, but I would've expected this one to be written with somehting like
coforall loc in targetLocales do on loc {
forall elem in getPrivatizedThis.localElements() {
if elem == elt {
foundElt.write(true);
}
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That does sound like a good idea, having each Bag
implement some parallel and serial iterator to reduce the amount of redundant code.
modules/standard/DistributedBag.chpl
Outdated
distribution across majority of nodes, but this should be called when you have | ||
a severe imbalance, or when you have a smaller number of elements to balance. | ||
Furthermore, while this operation is parallel-safe, it should be called in a | ||
sequential context. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think - but am not sure - that what you mean by "it should be called in a sequential context" is that it should only be called from one task at a time - since it redistributes the entire data structure and uses on statements etc to handle data elsewhere. I don't think "should be called in a sequential context" means that for me. I'd probably just say "Furthermore, while this operation is parallel-safe, it's only really useful to call it from one task at a time, since that call will update the entire distributed data structure".
modules/standard/DistributedBag.chpl
Outdated
|
||
// Allocate buffer, which holds the 'excess' elements for redistribution. | ||
// Then fill it. | ||
var buffer = c_malloc(eltType, excess); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why use c_malloc
instead of just a normal Chapel array?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently I have a helper method that takes both a c_ptr, a size, and a locale and writes into it, and one that does the opposite (reads from a c_ptr). I chose c_ptr over chapel arrays as they are more lightweight, but I didn't know about the bulk transfer support you mentioned before.
modules/standard/DistributedBag.chpl
Outdated
var give = average - nElems; | ||
var arr : [1..give] eltType; | ||
on bufferOffset { | ||
var tmpBuffer = buffer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If buffer were also a Chapel array, you'd write this transfer in a way that the Chapel compiler can optimize into a small number of PUTs by using array slices. E.g.
arr[1..give] = buffer[bufferOffset..bufferOffset+give-1];
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this situation, I'll give it to you, that does look like it'd save me a ton of logic and triple checking, I'll see about using that, and in particular, using slices like that.
modules/standard/DistributedBag.chpl
Outdated
if average > nElems { | ||
var give = average - nElems; | ||
var arr : [1..give] eltType; | ||
on bufferOffset { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure you meant on Locales[bufferOffset] or something like that - as it is, on bufferOffset will probably be a no-op since bufferOffset is running on the current locale.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We call on bufferOffset
because bufferOffset is located on the node that holds the buffer
, which is the node that called balance
. On line 341, you see that we do a for loc in Locales do on loc
. We have to jump to the locale owning the buffer
because its a C_Ptr.
modules/standard/DistributedBag.chpl
Outdated
} | ||
|
||
// Phase 3: Release all locks from first node and segment to last node and segment. | ||
for loc in localThis.targetLocales do on loc { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this need to happen in the reverse of the acquire order to prevent some deadlock case? Or am I just being paranoid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we hold the locks on all segments, so we don't need to bother checking to see if they are locked or not, we just unlock them. We start by locking the first node segment, then continue on to the last, so we unlock in the same way. Also should be noted that we use atomic spinlocks over sync
variables.
modules/standard/DistributedBag.chpl
Outdated
for loc in targetLocales { | ||
for segmentIdx in 0..#here.maxTaskPar { | ||
// The size of the snapshot is only known once we have the lock. | ||
// BUG: c_ptr does not seem to work here and causes a segmentation fault at runtime. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well yeah... c_ptr is always local and doesn't support use from another locale. _ddata would do that, but it's almost certainly better to just use a Chapel array in the first place.
modules/standard/DistributedBag.chpl
Outdated
|
||
inline proc push(elt : eltType) { | ||
if elems == nil { | ||
halt("DistributedBag Internal Error: 'elems' is nil"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have assert
too, which might be more succinct for errors like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue I've had with assert is that for some reason I can't seem to find the source of an assertion, normally it shows a line number that points into the Assert module. Furthermore, it doesn't seem to allow me to provide an error message for failing an assertion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, it's not worth too much trouble at this point. However we could certainly add error messages for asserts. And for the line numbers, try CHPL_UNWIND=libunwind and see where that gets you. Or compile without CHPL_DEVELOPER=1.
I ran some testing in the standard (local) configuration and get a failure for CollectionNQueens:
|
Did you change the number of queens for NQueens? As well, which data structure. Its possible to naturally reach 1M spins (was an arbitrary upper limit). |
After running it locally and inspecting it, the |
After testing, it seems that this issue only arises on |
Passed full local testing except for CollectionNQueens. |
It turns out the issue was much simpler than expected. When I was recycling the node, I forgot to sanitize it, so I was reusing the old values as well as the old headIdx and tailIdx, meaning it would be reusing data from the old operations. Now that I sanitize it (and have reran the benchmark 100 times on swan), I confident that it works now. Phew, had me nervous. @mppf I'll push out fix soon so you can test it tomorrow. |
…path. Need to find a way to put Collections in their own folder.
…ed issue where 'DistributedDeque' improperly reused the cached unroll block before cleaning it, resulting in wonky behavior
… we spawn too many tasks. NUMLOCALES dropped to 4 to help 'fifo' tasking layer, and made changes from 'barrier' to 'barriers'
ba83ff5
to
1f0fff2
Compare
Ready to merge. Had to rebase and force push my changes so that I could run with new |
Passes new test directory with:
|
/* | ||
Pass 2: Average Case | ||
|
||
Find any segment (locked or unlocked) and make an attempt to acquire it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment says "Find any segment" but the average case code only looks in 1 segment. Is that intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, that's outdated documentation. Originally we checked the status of the segment before trying to add to it (back when I had work stealing keep the lock on the segment, it would just look for an adjacent segment which wasn't). I can make a OR correcting it if needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, has there been a reported issue so far with the collections?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I havn't seen one, I'd let you know. I think we're still waiting for somebody to battle-test them.
You can fix the comment if you like, no big deal.
@mppf I've got some potentially good news! I've proposed a change to my independent study advisor to redirect it towards research towards Chapel! (As I said potentially, its not a given yet) If its approved, I'll be doing a lot more work on either Collections framework or the GlobalDescriptorTable/GlobalAtomicObject as my new research topic! (Also I got a small undergraduate research grant, woo!) |
Nevermind, gotta commit to current topic. I wonder if I could talk to someone on the team who is proficient enough with the LLVM backend to help me out a bit. |
Collections
I present a parallel-safe distributed data structures framework, similar to Java's 'Collection' framework. Original discussion can be seen here. Original repository containing benchmarks can be found here. Chpldocs documentation can be seen here.
Note: This is the official PR for the GSoC program, but I do realize it may seem a bit like a rough-draft. Any feedback or constructive criticism would be appreciated it.
Deque
Provides a strict ordering without sacrificing too much performance. Supports insertion
and removal from both ends, allowing FIFO, LIFO, and a Total ordering, which is
preserved across all nodes in a cluster, and employs a wait-free round-robin approach
to load distribution that ensures fairness in memory, bandwidth, and computation.
Disclaimer: The deque provided, while scalable, rely heavily on network atomics.
The benchmark results are produced using said network atomic operations.
Bag
With performance that scales both in the number of nodes in a cluster and the
number of cores per node, we offer a multiset implementation, called a 'Bag',
which is a medium that allows storing and retrieving data in any arbitrary order.
This type of data structure is ideal for work queues as it employs it's own load
balancing, and offers unparalleled performance.
Disclaimer: A node can request a 'privatized' copy, which retrieves a clone
that is allocated on the node requesting it, reducing any excess communication.
Usage of
getPrivatizedInstance()
is highly advised for performance-criticalsections.
Performance
We compare our data structures to a naive synchronized list implementation
as that is all that is available in the Chapel standard library.
In all cases, the data structures scale and outperform the naive implementation.
Insert
Remove
Revisions