Chunks
As the basis for all information encoding in Drasil, chunks have become an integral part of allowing us to use and maintain the current database of knowledge. At its core, a chunk is a data type specialized in holding a specific type of information for a specific purpose. For example, NamedChunks are often used for objects that have a unique identifier and an associated term. ConceptChunks mirror real-world concepts by including the idea, definition, and domain for a particular concept. Something like a QuantityDict can have an idea, the space in which it exists, units, and a symbol. Many other chunks exist within Drasil that allow the program to hold the required information and its meaning so that knowledge may be used in generated models, definitions, and theories.
Chunks are usually made up of lower-level types with different purposes. A chunk whose purpose is to hold all the information needed for a mathematical variable would need a symbol, description/definition, and units (as shown below). This particular example gives a name to the concept which is built from a quantity and its units. The structure of a chunk can be thought of as a wrapper of sorts. It encases only the necessary information to perform its job, but its contents may be unwrapped and used one at a time. The wrapper itself may be wrapped again with more things added to it (like an abbreviation or a domain). This is primarily how one idea can be built upon in Drasil.
So, how do we represent this in code? Conveniently, we can use Haskell's record syntax along with lenses to define, set, and get the information we need from within the chunk wrapper. This way, we can wrap wrappers without worrying about the "level" of wrapping around one particular identifier. As a result, one UID can be represented in a hierarchy of chunks, with no information loss when upgrading to a larger chunk. A straightforward example of this is the progression from a lower-level NamedChunk to something much larger like a TheoryModel. One of the smallest chunks (NamedChunk) is defined as follows:
data NamedChunk = NC {_uu :: UID, _np :: NP}
It contains a unique identifier (UID) and a term that can be used in creating sentences (as a noun phrase, NP). As of now, we don't know what this NamedChunk is or what it can do, but we do know that it exists and we can use it in a sentence with proper pluralization and capitalization. Most likely, these chunks will be common nouns that are significant enough to have a name. Two NamedChunks may also be combined to produce a new NamedChunk that carries both of their terms. We can start to define single words and simple ideas like table_ and symbol and then combine those to make a tableOfSymbol NamedChunk idea, which is more complex. Using the wrapper analogy, we unwrap the term from table_ and symbol, then rewrap them after placing an "of" between them to get a tableOfSymbol chunk.
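The unwrap-and-rewrap step described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: UID and NP are plain strings here (Drasil's real types are richer), and ofChunk is a hypothetical helper, not a Drasil function.

```haskell
-- Simplified stand-ins (assumptions): in Drasil, UID and NP are richer types.
type UID = String
type NP  = String

data NamedChunk = NC { _uu :: UID, _np :: NP }
  deriving (Show, Eq)

-- Hypothetical helper: unwrap both terms, place "of" between them,
-- and rewrap the result under a new UID.
ofChunk :: UID -> NamedChunk -> NamedChunk -> NamedChunk
ofChunk newUid a b = NC newUid (_np a ++ " of " ++ _np b)

table_, symbol, tableOfSymbol :: NamedChunk
table_        = NC "table"  "table"
symbol        = NC "symbol" "symbol"
tableOfSymbol = ofChunk "tableOfSymbol" table_ symbol
```

Loaded into GHCi, `_np tableOfSymbol` evaluates to "table of symbol", while table_ and symbol remain available on their own.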
A NamedChunk can either be used as a way of getting a defined term or be built upon. The "next step" up from a NamedChunk is an IdeaDict, which contains a NamedChunk and maybe an abbreviation. We can see the direct progression in its type definition:
data IdeaDict = IdeaDict { _nc' :: NamedChunk, mabbr :: Maybe String }
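To illustrate how lenses let us ignore the "level" of wrapping, here is a minimal sketch that uses only the base library. The Lens' type, view, and set are hand-rolled stand-ins for what a lens library provides, the lenses themselves are written by hand rather than generated, and the HasUID and NamedIdea classes are simplified assumptions about how a shared interface over chunks might look. The UID string is hypothetical.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- Minimal van Laarhoven lens machinery (stand-in for a lens library).
type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

view :: Lens' s a -> s -> a
view l = getConst . l Const

set :: Lens' s a -> a -> s -> s
set l x = runIdentity . l (const (Identity x))

-- Simplified stand-ins (assumptions) for Drasil's UID and NP types.
type UID = String
type NP  = String

data NamedChunk = NC { _uu :: UID, _np :: NP }
data IdeaDict   = IdeaDict { _nc' :: NamedChunk, mabbr :: Maybe String }

-- Hand-written lenses for the underscore-prefixed fields.
uu :: Lens' NamedChunk UID
uu f (NC u n) = fmap (\u' -> NC u' n) (f u)

np :: Lens' NamedChunk NP
np f (NC u n) = fmap (\n' -> NC u n') (f n)

nc' :: Lens' IdeaDict NamedChunk
nc' f (IdeaDict c a) = fmap (\c' -> IdeaDict c' a) (f c)

-- One interface, regardless of how deeply the UID or term is wrapped.
class HasUID c where uid :: Lens' c UID
class HasUID c => NamedIdea c where term :: Lens' c NP

instance HasUID NamedChunk    where uid  = uu
instance NamedIdea NamedChunk where term = np
instance HasUID IdeaDict      where uid  = nc' . uu
instance NamedIdea IdeaDict   where term = nc' . np

-- Example value; the UID is hypothetical.
dblPendulum :: IdeaDict
dblPendulum = IdeaDict (NC "dblPendulum" "Double Pendulum") (Just "DblPendulum")
```

With these instances, `view uid` and `view term` work the same way on a NamedChunk and on an IdeaDict, because each larger chunk simply composes its own lens with the lens of the chunk it wraps.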
As we continue to learn more about what exactly we want this chunk to represent, we can gain more specifics about the idea and directly create a richer type to work with such information. From this point, there are many options available to continue adding information. If the idea should be made into a concept, we can use a ConceptChunk to wrap the idea along with a definition and its domain:
data ConceptChunk = ConDict { _idea  :: IdeaDict -- ^ Contains the idea of the concept.
                            , _defn' :: Sentence -- ^ The definition of the concept.
                            , cdom'  :: [UID]    -- ^ UIDs of the domains of the concept.
                            }
If we know the concept is a quantity or can be treated as one, it may become a QuantityDict or DefinedQuantityDict:
data DefinedQuantityDict = DQD { _con   :: ConceptChunk
                               , _symb  :: Stage -> Symbol
                               , _spa   :: Space
                               , _unit' :: Maybe UnitDefn
                               }
By continuously wrapping the information needed, we can successfully encode relevant knowledge in a useful and practical manner.
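The whole wrapping chain can be sketched end to end as follows. This is a simplified illustration, not Drasil's implementation: Sentence, Symbol, and UnitDefn are plain strings, Space and Stage are cut-down stand-ins, plain getter classes replace lenses to keep the example short, and the pendulum-arm UID and definition text are hypothetical.

```haskell
-- Simplified stand-ins (assumptions) for the Drasil types used in the records.
type UID      = String
type NP       = String
type Sentence = String
type Symbol   = String
type UnitDefn = String
data Space    = Real | Natural deriving (Show, Eq)
data Stage    = Equational | Implementation

data NamedChunk   = NC { _uu :: UID, _np :: NP }
data IdeaDict     = IdeaDict { _nc' :: NamedChunk, mabbr :: Maybe String }
data ConceptChunk = ConDict { _idea :: IdeaDict, _defn' :: Sentence, cdom' :: [UID] }
data DefinedQuantityDict = DQD { _con   :: ConceptChunk
                               , _symb  :: Stage -> Symbol
                               , _spa   :: Space
                               , _unit' :: Maybe UnitDefn
                               }

-- Getter-only interface: each layer delegates to the chunk it wraps.
class HasUID c where uid :: c -> UID
class HasUID c => NamedIdea c where term :: c -> NP

instance HasUID NamedChunk             where uid  = _uu
instance NamedIdea NamedChunk          where term = _np
instance HasUID IdeaDict               where uid  = uid . _nc'
instance NamedIdea IdeaDict            where term = term . _nc'
instance HasUID ConceptChunk           where uid  = uid . _idea
instance NamedIdea ConceptChunk        where term = term . _idea
instance HasUID DefinedQuantityDict    where uid  = uid . _con
instance NamedIdea DefinedQuantityDict where term = term . _con

-- Build one idea up layer by layer (UID and definition are hypothetical).
armName :: NamedChunk
armName = NC "pendArm" "pendulum arm"

armIdea :: IdeaDict
armIdea = IdeaDict armName Nothing

armConcept :: ConceptChunk
armConcept = ConDict armIdea "the rod connecting the pivot to the mass" ["physics"]

armQuantity :: DefinedQuantityDict
armQuantity = DQD armConcept (const "l") Real (Just "m")
```

Because every layer delegates downward, `uid armQuantity` and `term armQuantity` return the same values that were given to armName, which is the "no information loss" property described earlier.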
Eventually, we build up relevant chunks by noticing common patterns in examples and actual documentation. We have various high-level chunks dedicated to units (UnitDefn, UnitaryConceptDict, UnitaryChunk, UnitalChunk), relations (RelationConcept), quantities (QuantityDict, DefinedQuantityDict), uncertainties (UncertainChunk, UncertQ), and much more. Our foundation of knowledge is built upon these chunks, and the strong typing of Haskell emphasizes the semantic meaning that should be associated with each type. As Drasil grows, more and more chunks will be added with different chunk types, thereby allowing our database of knowledge to grow alongside it. For more information on the chunks currently available in Drasil, please see the Haddock documentation.
This section contains a list of the chunks currently defined in drasil-lang (as of August 3, 2021), along with a short description for each of them.
Chunk Name | Description | Example |
---|---|---|
ConceptChunk | Used to make a concept that has a term and definition. It may also be tagged with some domain of knowledge. | The concept of "Accuracy" may be defined as the quality or state of being correct or precise. |
CommonConcept | Similar to a ConceptChunk, but it must have an abbreviation. Not used widely across Drasil. | "HGHC" is defined as dcc' "hghc" (cn "HGHC") "HGHC program" "HGHC". |
ConceptInstance | Used for a concept that can be referred to. Often used in Goal Statements, Assumptions, Requirements, etc. | A concept that we would want to refer back to, such as the assumption that gravitational acceleration is 9.81 m/s². When we write our equations, we can link to this assumption so that we do not have to derive it again to verify our work. |
Citation | A citation refers to other people's work. In Drasil, the reference address of a citation becomes the UID for that citation. It also contains other necessary information such as the kind of citation and citation fields. | A reference to a thesis like Koothoor's "Document driven approach to certifying scientific computing software" would include the affiliated university, publishing year, and city. |
CI | A common idea is something that is worth naming, similar to a NamedChunk. However, it also includes an abbreviation and the domains of knowledge in which it appears. | The term "Operating System" has the abbreviation "OS" and comes from the domain of computer science. |
ConstrainedChunk | These are symbolic quantities with some constraints and maybe a reasonable value. | The length of a pendulum would have some reasonable value (between 1 cm and 2 m) and the constraint that the length cannot be negative. |
ConstrConcept | Similar to a ConstrainedChunk, but is instead built off of a DefinedQuantityDict. This means that the value also has a definition and an associated domain of knowledge. | We could use a similar example to the one for ConstrainedChunk, except we would know the definition of a pendulum arm and its domain (physics). |
DefinedQuantityDict | For when we want to assign a quantity to a concept. Includes the space, symbol, and units for that quantity. | A pendulum arm can be defined as a concept with a symbol (l), space (Real numbers), and units (cm, m, etc.). |
QDefinition | Building off of a QuantityDict, we now have a defining expression with inputs, a definition, and a domain. Used to make definitions and models. | Finding the velocity of a pendulum arm through a QDefinition would entail an equation to find velocity and input values. |
NamedArgument | This chunk type is a wrapper for a QuantityDict, but used more for generating code and ODEs. | Can be used to define inline arguments in generated code. |
NamedChunk | One of the lowest-level chunks. Used for anything worth naming; it only contains a term and its UID. | A pendulum arm will start out by being named as such, before we can add any values or equations to it. |
IdeaDict | It is simply a NamedChunk that could have an abbreviation (similar to CI, but it does not require an abbreviation and does not have a domain). | The project name "Double Pendulum" may have the abbreviation "DblPendulum". |
QuantityDict | In a similar way to DefinedQuantityDict, this chunk adds a space, symbol, and units. However, the information may not necessarily be a concept, but rather anything that is named through an IdeaDict. | A pendulum arm does not necessarily have to be defined as a concept before we assign a space (Real numbers), a symbol (l), or units (cm, m, etc.). |
RelationConcept | These are for concepts that may have an associated expression. Used often in creating definitions and models. | We can describe a pendulum arm and then apply an associated equation so that we know its behaviour. |
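
To make one row of the table concrete, the sketch below models the ConstrainedChunk pendulum example. Everything here is a hypothetical simplification for illustration: the Interval and ConstrainedSketch types, the field names, and the chosen reasonable value of 0.5 m are assumptions, and Drasil's own constraint types are not reproduced.

```haskell
-- Hypothetical, simplified model of "a quantity with constraints and maybe
-- a reasonable value". None of these names come from Drasil.
data Interval = Interval
  { lowerB :: Maybe Double  -- ^ Inclusive lower bound, if any.
  , upperB :: Maybe Double  -- ^ Inclusive upper bound, if any.
  }

data ConstrainedSketch = Constrained
  { quantityName  :: String
  , constraints   :: [Interval]
  , reasonableVal :: Maybe Double
  }

-- A value satisfies an interval when it respects every bound that is present.
satisfies :: Interval -> Double -> Bool
satisfies (Interval lo hi) x = maybe True (<= x) lo && maybe True (x <=) hi

-- A value is valid for a chunk when it satisfies all of its constraints.
valid :: ConstrainedSketch -> Double -> Bool
valid c x = all (`satisfies` x) (constraints c)

-- Pendulum length in metres: it cannot be negative, and 0.5 m is an assumed
-- reasonable value inside the 1 cm to 2 m range mentioned in the table.
pendLength :: ConstrainedSketch
pendLength = Constrained "pendulum length" [Interval (Just 0) Nothing] (Just 0.5)
```

Here `valid pendLength 0.5` is True and `valid pendLength (-1)` is False, since the only constraint is that the length is non-negative.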
It can be quite difficult to see the dependencies of each chunk, so making graphs and data tables (by running make analysis) can help us to fine-tune which chunks should exist and which chunks need to be modified.
These are all very important aspects needed to keep programs relevant and usable. This also means that Drasil should be able to adapt to new knowledge while still holding on to older information. As users input knowledge needed to complete their goals or projects, Drasil should be able to absorb information and consistently generate reliable artifacts dependent on that information. Of course, there will be many steps in between giving Drasil information and it giving back meaningful documentation, but the idea of Drasil constantly gaining knowledge should be present any time we choose to work with it.
For example, a concept can be stored in a ConceptChunk, which holds the unique identifier, term, maybe an abbreviation, a definition, and a domain for a real-world concept in physics, mathematics, or computer science.