Document pandas.DataFrame.to_stata data_label #13535
The release note adding it is here
So it looks like a single label for the entire dataset, not one per column. Though I could be wrong, as I have only used Stata once. cc @bashtage who added this.
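To illustrate the "single label for the entire dataset" reading, here is a minimal sketch that writes a plain string as `data_label` and reads it back. The file name, the label text, and the use of a temporary directory are illustrative assumptions, not something taken from this thread.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"one": [1.0, 2.0, 3.0, 4.0]})

# Write to a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "test.dta")

    # data_label is one plain string describing the whole dataset,
    # not a mapping from column names to labels.
    df.to_stata(path, write_index=False, data_label="Example dataset")

    # iterator=True returns a StataReader, which exposes file metadata.
    with pd.read_stata(path, iterator=True) as reader:
        stored_label = reader.data_label

print(stored_label)
```

If the label round-trips, `stored_label` should match the string passed to `to_stata`.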
Thanks Tom, yes, that seems to be the case unfortunately. I would really like to be able to add labels for individual variables though. Should I open a feature request for that?
Only dataset labels are implemented, not per-variable labels. The dataset label in the dta file format is:
@frehoy This would need a feature request. The docs should also cover these two inputs.
It seems that it is almost implemented here. See and https://github.com/pydata/pandas/blob/master/pandas/io/stata.py#L2134 As you can see, this always passes the default. It mostly needs an external interface and a tiny amount of wiring up. And testing, especially that the labels can be read into Stata.
Add support for writing variable labels
Fix documentation for to_stata
Clean up function name to improve readability
closes pandas-dev#13536
closes pandas-dev#13535
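For readers landing here after that change: a minimal sketch of per-variable labels, assuming a pandas version whose `to_stata` accepts a `variable_labels` dictionary mapping column names to labels. The parameter name is not stated in this thread, so treat it as an assumption and check the documentation for your installed version. The file name and label text are illustrative.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"one": [1.0, 2.0, 3.0, 4.0]})

# Assumed interface: one label per column, keyed by column name.
labels = {"one": "foo"}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "test.dta")
    df.to_stata(path, write_index=False, variable_labels=labels)

    # Read the labels back through the StataReader metadata interface.
    with pd.read_stata(path, iterator=True) as reader:
        stored_labels = reader.variable_labels()

print(stored_labels)
```

As the comment above notes, reading the labels back in pandas is only part of the check; the file should also be opened in Stata itself to confirm the labels appear there.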
I work with pandas and Stata and find the DataFrame.to_stata() method very valuable. I would like to assign variable labels in my .dta files, but the data_label parameter of DataFrame.to_stata() is not documented, so I do not know in which format to supply my variable labels.
I tried a dictionary of the form {'df column name': 'wanted label'}, but that raises
TypeError: unhashable type: 'slice'
Here is the page in the documentation I am referring to: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_stata.html#pandas.DataFrame.to_stata
If it could be updated to provide a working example of data_label I would be eternally grateful. This is my first issue on GitHub and I hope I specified it correctly. Feel free to point out if I messed up the format somehow.
Thanks!
Code Sample, a copy-pastable example if possible
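```python
import pandas as pd

d = {'one': [1., 2., 3., 4.]}
df = pd.DataFrame(d)

# Attempt to pass per-column labels through data_label.
labdict = {'one': 'foo'}

# Raises TypeError: unhashable type: 'slice' (pandas 0.17.1, Python 3.5.1)
df.to_stata('test.dta', write_index=False, data_label=labdict)
```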
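One plausible reading of the error, sketched without pandas: if the writer truncates the label with a slice (an assumption this thread does not confirm), a string works but a dictionary fails, because slicing a dict is treated as a key lookup. The helper `try_truncate` and the 80-character limit are hypothetical, chosen only for illustration.

```python
def try_truncate(label, max_len=80):
    """Mimic a writer that truncates the label by slicing (an assumption).

    Returns the truncated label, or the name of the exception raised.
    """
    try:
        return label[:max_len]
    except (TypeError, KeyError) as exc:
        # Older Pythons raise TypeError because slice objects are unhashable;
        # Python 3.12+ makes slices hashable, so the dict lookup raises KeyError.
        return type(exc).__name__


string_result = try_truncate("foo")
dict_result = try_truncate({"one": "foo"})

print(string_result)
print(dict_result)
```

Either way, the practical point is the same: `data_label` expects a single string, not a mapping.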
Expected Output
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Darwin
OS-release: 15.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
pandas: 0.17.1
nose: 1.3.7
pip: 8.1.1
setuptools: 19.6.2
Cython: 0.23.4
numpy: 1.10.4
scipy: 0.17.0
statsmodels: 0.6.1
IPython: 4.0.3
sphinx: 1.3.5
patsy: 0.4.0
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.4.6
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.5.0
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.11
pymysql: None
psycopg2: None
Jinja2: 2.8