Stack/Melt over multiple sets of variables #1839
Comments
I have written for myself a function that melts over multiple variables but I am not very good at optimizing code.
Example:
The functionality is similar to
I can try to write some tests, add documentation, and improve the readability of the function before I submit a PR, but feel free to use it as a clarification of my feature request in case you are already working on a similar function.
@nalimilan recently mentioned an intention to work on
Regarding your code
I am closing this in favor of #3237 (to have a single place to discuss all related issues).
It would be a useful feature if `stack` and `melt` could be implemented over multiple sets of variables. For example, from this:
To this:
This is implemented in R and Stata via `pivot_longer` and `reshape long`, respectively.
@bkamins suggested the following for a simpler name:
For a more general application, users need to name the columns following a specific pattern, such as `varA_2018`, so that the function can split each column name into the variable name and the value.
One idea is to make use of a regular expression, e.g.:
The function would `melt` the DataFrame once for each specified set of columns (two in the above example), keeping the specified ID. For each set it would rename the value column to `m[1]` (in this case `:varA` and `:varB`), rename the variable column to `:Year`, and replace the values of `:Year` with `m[2]`. Finally, it would merge the intermediate DataFrames over `:ID` and `:Year`.
The second method is to specify the name of the new variable as an argument in `melt`:
Using a regular expression, the function will `melt` the DataFrame using the columns that start with "varA" and "varB" respectively, rename and replace the values as above, and then merge the DataFrames over `:ID` and `:Year`.
I am not very good at writing functions for packages, especially regular expressions, but if you need any help implementing this I would be glad to help as best I can.
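To make the regex-based procedure described above concrete, here is a minimal, language-agnostic sketch in plain Python. It is not DataFrames.jl code and not the function proposed in this issue: the name `stack_multiple`, its arguments, and the sample data are all hypothetical, and a list of dicts stands in for a DataFrame. It assumes the pattern has two capture groups, the first giving the new value column (`m[1]`) and the second giving the value of the new key column (`m[2]`).

```python
import re


def stack_multiple(rows, id_col, pattern, name_col):
    """Hypothetical helper: reshape wide rows to long form over several column sets.

    rows     -- list of dicts, one per wide row (stand-in for a DataFrame)
    id_col   -- name of the identifier column, e.g. "ID"
    pattern  -- regex with two groups: (new value column)(new key value)
    name_col -- name of the new key column, e.g. "Year"
    """
    regex = re.compile(pattern)

    # Step 1: "melt" once per column set. Each set gets its own
    # intermediate table keyed by (id, key value).
    melted = {}
    for row in rows:
        for col, value in row.items():
            m = regex.fullmatch(col)
            if m is None:
                continue  # the id column or an unrelated column
            stub, key = m.group(1), m.group(2)
            melted.setdefault(stub, {})[(row[id_col], key)] = value

    # Step 2: merge the intermediate tables over (id_col, name_col).
    merged = {}
    for stub, table in melted.items():
        for (id_value, key), value in table.items():
            record = merged.setdefault(
                (id_value, key), {id_col: id_value, name_col: key}
            )
            record[stub] = value

    # Combinations absent from one set are filled with None (an outer join).
    for record in merged.values():
        for stub in melted:
            record.setdefault(stub, None)
    return list(merged.values())


# Made-up sample data using the varA_2018 naming pattern from the text.
wide = [
    {"ID": 1, "varA_2018": 10, "varA_2019": 11, "varB_2018": 0.5, "varB_2019": 0.6},
    {"ID": 2, "varA_2018": 20, "varA_2019": 21, "varB_2018": 1.5, "varB_2019": 1.6},
]
long_rows = stack_multiple(wide, "ID", r"(var[AB])_(\d+)", "Year")
for record in long_rows:
    print(record)
```

In this sketch the key values stay strings (`"2018"`, not `2018`), because they are taken from the column names; converting them would be a separate step. The second method, which selects columns by prefix, can be approximated here by building the pattern from the prefixes, for example `r"(varA|varB)_(.+)"`.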