Stata package that generates a publication-quality subject disposition flow diagram in LaTeX (PGF/TikZ) from within Stata, using texdoc
Fully tested on Stata versions 13, 14, and 15
Use the flowchart package in Stata to generate a publication-quality subject disposition flowchart diagram in LaTeX format. The package lets Stata generate the TikZ code needed to include the diagram in a LaTeX document and produce it as a PDF or any other format. Just as the 'estout' package is commonly used to turn regression output into tables for published journal articles, this package can be used to generate flowcharts for publications that show how the subjects of a study were included in, or excluded from, analysis groups and any further refinements of those groups.
The final diagram is similar in style to the ones used in the PRISMA Statement, CONSORT 2010 Statement, and STROBE Statement reporting guidelines, which are very commonly used in formal publications of systematic review, clinical trial, and cohort study findings. The package automates the diagram so that the numbers in it change as the analysis changes, saving hours of work.
For an example of the package's output, please see example1output.pdf and the example code that produced it in flowchart_example1.do.
Here is a low-resolution screenshot of the PDF:
The format closely follows the example of a CONSORT-style flow diagram at TeXample, which was written in PGF/TikZ by Morten Willert. The example code that generates the above diagram is included in the ancillary files installed with flowchart.
New Install: In Stata, to install the flowchart package and its main dependencies in your system's ado filepath, and also flowchart's ancillary files into your current working directory, type or copy-and-paste the following commands into Stata:
. net install flowchart, from("https://raw.github.com/isaacdodd/flowchart/master/") replace
. flowchart setup
Updates: To later update your flowchart installation with the latest changes to the package, use the following command in Stata:
. flowchart setup, update
Uninstall: To completely remove the package, flowchart can uninstall itself with the following command:
. flowchart setup, uninstall
After installation, type flowchart getstarted for an introductory message to the flowchart package.
Type help flowchart for detailed instructions on how to get started. Study the files flowchart_example1.do and flowchart_example2.do for carefully laid out examples of usage and a detailed, thorough explanation of the format.
The format for the code follows this typical example, which is available in flowchart_example2.do:
* INITIALIZE: Start a new datatool variable file.
flowchart init using "filename.data"
* WRITE ROWS: [center-block] , [left-block]
* Row with 2 blocks.
flowchart writerow(rownametest1): "lblock1_line1" 46 "This is one line, \\ of a block." ///
"lblock1_line2" 43 "This is another line, of a block" ///
"lblock1_line3" 3 "This is another line, of a block", ///
"rblock1_line1" 97 "This is one line, of a block." ///
"rblock1_line2" 33 "This is another line, of a block" ///
"rblock1_line3" 44 "This is another line, of a block"
* A '\\' in a description introduces a newline in LaTeX.
* Each of the 2 blocks can take several lines.
* Each line is a space-separated triplet of 3 fields: "variable_name" n_number "Descriptive text."
* Row with No center-block (a center-block appears on the left)
flowchart writerow(rownametest2): Flowchart_Blank, ///
"rblock1_line1" 97 "This is one line, of a block." ///
"rblock1_line2" 33 "This is another line, of a block" ///
"rblock1_line3" 44 "This is another line, of a block"
* Row with No left-block (a left-block appears on the right)
flowchart writerow(rownametest3): "lblock1_line1" 46 "This is one line, \\ of a block." ///
"lblock1_line2" 43 "This is another line, of a block" ///
"lblock1_line3" 3 "This is another line, of a block", Flowchart_Blank
* Row with No center-block and a Singleton Lead-Line in the left-block
flowchart writerow(rownametest4): Flowchart_Blank, "rblock1_line1" 97 "This is one line, \\ of a block."
* Row with Singleton Lead-Line in the center-block and No left-block
flowchart writerow(rownametest5): "lblock1_line1" 46 "This is one line, \\ of a block.", Flowchart_Blank
* CONNECTIONS: Use the block orientation to connect arrows to the appropriate blocks
flowchart connect rownametest1_center rownametest1_left
flowchart connect rownametest1_left rownametest2_left
flowchart connect rownametest1_center rownametest3_center
flowchart connect rownametest3_center rownametest5_center
flowchart connect rownametest2_left rownametest4_left
* FINALIZE: This writes the files and generates the 'tikzpicture'
flowchart finalize, template("figure.texdoc") output("figure.tikz")
After running this code, the LaTeX figure and manuscript files, which tie in the TikZ file and the data file, can be compiled using a LaTeX distribution and an editor with a previewer. Please see flowchart_example1.do and the list of resources for more information on LaTeX.
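The example above types its counts in as literals. Because Stata expands local macros before a command sees its arguments, the counts can instead be computed from the data, so the diagram updates whenever the analysis changes. The following is only a sketch under stated assumptions: it uses Stata's built-in auto dataset as a stand-in for a study dataset, the row names and variable names (assessed, analyzed, assessed_n, and so on) are hypothetical, and it assumes flowchart is installed with the figure.texdoc template in the working directory. Run it from a do-file, since `///` and `//` are not accepted interactively.

```stata
* Sketch only: derive every n from the data instead of typing it in.
sysuse auto, clear                 // built-in demo data standing in for a study dataset

count
local n_assessed = r(N)            // all records
count if missing(rep78)
local n_excluded = r(N)            // excluded here: repair record missing
count if !missing(rep78)
local n_analyzed = r(N)            // retained for analysis

* The macros expand to plain numbers before flowchart parses the row.
flowchart init using "filename.data"
flowchart writerow(assessed): "assessed_n" `n_assessed' "Records assessed", ///
    "excluded_n" `n_excluded' "Excluded: missing repair record"
flowchart writerow(analyzed): "analyzed_n" `n_analyzed' "Included in analysis", Flowchart_Blank
flowchart connect assessed_center assessed_left
flowchart connect assessed_center analyzed_center
flowchart finalize, template("figure.texdoc") output("figure.tikz")
```

Keeping the counts in local macros also makes it easy to check them before the diagram is written, for example by confirming that the excluded and analyzed counts sum to the number assessed.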
Under the hood, flowchart is an interesting package because it functions like a small compiler and pushes against some of the limits of Stata's language features. It splits its simple Stata input format into discrete tokens using Stata's low-level tokenizer; a hand-written parser (similar to a 2-token look-ahead, or LALR(2), parser) then makes sense of each token based on its position in the sequence. Other internal functions translate the tokens into the corresponding PGF/TikZ code, looping through each block's lines. The connections are written to a separate section.
The result is TikZ code that compiles into a flow diagram. (To see some of this in action, turn on the debugging log with the command flowchart debug on and run the code.) In the future, the overall structure could be simplified and its functionality extended to other kinds of diagrams and other languages, reusing the same underlying principles.
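To make the tokenizing idea concrete, here is a much-simplified sketch of how a sequence of "variable_name" n_number "Descriptive text." triplets can be read token by token in Stata. It is not flowchart's actual parser: the program name demo_triplets is hypothetical, the choice of gettoken as the tokenizer is an assumption, and the handling of the comma between blocks, Flowchart_Blank, and any look-ahead is left out.

```stata
* Hypothetical helper, not part of flowchart: reads "name" n "text" triplets.
capture program drop demo_triplets
program define demo_triplets, rclass
    local rest `"`0'"'             // everything typed after the command name
    local i = 0
    local total = 0
    while `"`rest'"' != "" {
        * Pull three tokens; gettoken strips the outer quotes by default.
        gettoken name rest : rest
        gettoken n    rest : rest
        gettoken text rest : rest
        * Check the middle field before accepting the triplet.
        capture confirm integer number `n'
        if _rc {
            display as error "expected an integer count after `name'"
            exit 198
        }
        local ++i
        local total = `total' + `n'
        display as text "line `i': `name' = `n' (`text')"
    }
    * Hand the summary back to the caller in r().
    return scalar n_lines = `i'
    return scalar total   = `total'
end

demo_triplets "enrolled" 120 "Patients enrolled" "excluded" 15 "Did not meet criteria"
display r(n_lines) " lines parsed, counts sum to " r(total)
```

Validating each field as it is read is what lets a parser of this kind report a malformed row at the point of the mistake rather than emitting broken TikZ code.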
As they are identified, useful resources that produce visuals and diagrams using statistical methods will be listed in RESOURCES.md.
Contributions are greatly, greatly appreciated, and a major goal for this package is for flowchart to become a community-driven package.
Please report bugs through GitHub, since it is much more difficult to respond to issues or feature requests via email. All comments, feedback, and suggestions are welcome.
Please send your code as pull requests here on GitHub for review. Feel free to make this project your own by contributing code changes, new features, and fixes rather than forking the code into a separate project. Collaborations are very welcome. Please see CONTRIBUTING.md for an explanation of how to get started.
If this package becomes widely used, or there are further feature requests, additional steps could include different styles of flowchart diagrams to match various reporting guidelines, functions that create boxes from calculations, or a wider variety of box options. There are many directions this project could take should a user base arise.
Credit to Ben Jann, whose texdoc package is a dependency in flowchart, and Morten Willert, whose example of a flowchart diagram in TikZ was studied and used heavily to generate similar flowcharts in this package.
By installing this package you agree to the GNU LGPL under which this package is licensed. Please see LICENSE.txt for the full license of the GNU LGPL 2007, which does allow for the incorporation of this package in proprietary software if necessary but without warranty. Note: 'Flowchart' comes with ABSOLUTELY NO WARRANTY; This is free software, and you are welcome to redistribute it under certain conditions. Copyright © 2017. Isaac M. E. Dodd. All rights reserved.