Can convert (and revert) Jupyter notebooks to Markdown and script files, i.e. plain-text files instead of a single JSON file per notebook.
Could be useful for version tracking, or for converting between a Jupyter-centric and a vim-centric data workflow.
Third edition of the well-known data-analysis book for pandas (and NumPy), written by the pandas author.
karlicoss of the data-liberation project HPI explains how best to store and access data moved from various points in the cloud/web to your own drives, and why databases might not always be the best choice.
TL;DR:
Save your grabbed data without any manipulation.
Let the manipulation happen every time you access/interpret the data.
If you have slices of data (mostly time frames), don't try to merge them on disk; save them as separate files and merge them on access/interpretation as well.
You can use databases as an access cache, since the previous points add some overhead to each access.
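The points above can be sketched in Python. This is a minimal illustration, not karlicoss's actual code: the file layout (one JSON list per slice file) and the `timestamp` field are assumptions made for the example. Raw slice files stay untouched on disk; merging and deduplication happen at read time.

```python
import json
from pathlib import Path

def load_events(data_dir):
    """Merge all raw JSON slices on access instead of on disk.

    Each slice file is assumed to hold a list of event dicts;
    overlapping time frames are deduplicated here, at read time,
    so the raw files are never modified.
    """
    seen = set()
    events = []
    for slice_file in sorted(Path(data_dir).glob("*.json")):
        for event in json.loads(slice_file.read_text()):
            key = event["timestamp"]  # assumed unique per event
            if key not in seen:
                seen.add(key)
                events.append(event)
    return sorted(events, key=lambda e: e["timestamp"])
```

If this merge step gets slow, its result is exactly the kind of thing you would cache in a database, per the last point above.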
The easiest answer is with pandas as a library:
import pandas as pd
df = pd.read_json('inputfile.json')
df.to_csv('outputfile.csv', encoding='utf-8', index=False)
read_json converts a JSON string or file to a pandas object (either a Series or a DataFrame).
to_csv can either return a string or write directly to a CSV file; see the docs for to_csv.
Works best when the JSON is an array of structured objects (for unstructured data, see the SO answer in the link).
For additional pandas-to-CSV tips, see this SO thread.
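For JSON with nested objects, pandas' json_normalize can flatten the records before writing CSV. A minimal sketch, with made-up records standing in for a real input file:

```python
import pandas as pd

# Hypothetical nested records, a stand-in for a real JSON file.
records = [
    {"id": 1, "user": {"name": "ada", "city": "london"}},
    {"id": 2, "user": {"name": "bob", "city": "berlin"}},
]

# json_normalize flattens nested objects into dotted column names
# (id, user.name, user.city), which to_csv can then write directly.
df = pd.json_normalize(records)
csv_text = df.to_csv(index=False)
print(csv_text)
```

With a path argument instead, df.to_csv('outputfile.csv', index=False) writes the file directly, as in the snippet above.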
Also, a really generic template you could use is something like this:
1. Find a data blob, an API, or web-scrape a site for raw data you're interested in.
2. Figure out how to store that data. Do you need a relational database, or maybe NoSQL? How will the records be stored, and what does your data model look like?
3. Use analytics packages like numpy, or something else, to draw conclusions or find interesting themes in your data.
4. Now do something with it! Maybe a front end to display it all. You can use Dash to build a quick and light visualization of your findings, or something more full-stack like a Django application or even Flask. Totally up to you.
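The middle of that template can be sketched in a few lines. This is only an illustration: the raw rows are made up (standing in for an API response or scrape), storage is an in-memory SQLite table, and the "analysis" is a single aggregate query.

```python
import sqlite3

# Made-up raw data, standing in for step 1 (API blob or scrape).
raw = [
    ("2024-01-01", "widgets", 3),
    ("2024-01-01", "gadgets", 5),
    ("2024-01-02", "widgets", 7),
]

# Step 2: store the records relationally (here, in-memory SQLite).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (day TEXT, product TEXT, qty INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", raw)

# Step 3: a simple aggregate, total quantity per product.
totals = dict(conn.execute(
    "SELECT product, SUM(qty) FROM sales GROUP BY product"
))
print(totals)
```

Step 4 would then hand `totals` to whatever front end you pick (Dash, Django, Flask).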
An interesting use of Loki to grab shell history, store it centrally, and then reuse it from the command line to replace the shell's traditional history functionality. Also includes a little tidbit on integrating your shell history with e.g. Grafana.
This post shows how to figure out the best data layout for InfluxDB v2, some schema design best practices, and a schema development example.
Huge collection of computable, curated data, from demographics to language, science & math, politics, and social media. Many formats: numerical, time series, image, audio, geospatial. Can be exported as simple CSV, or worked with in Python notebooks on the page.
A long list of data sources, divided by general topics, and of varying quality.
Research Guides: Social Science Data Sources & Statistical Methods: Free Data Sources
A collection of free sources of various kinds of data, recommended by EMU.
Grafana is an amazing visualization tool used mainly by IT teams to monitor their infrastructure. As it's open source, there are many community contributions of both data sources and panels.
Working with the Chronicling America API: Chronicling America makes available American newspapers between 1789 and 1963, which means we can explore the contents of the archive through their API. From a search term and the number of results we'd like back, we can build the search URL; the API then returns our data along with the total number of results.
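Building such a search URL is a one-liner with the standard library. The parameter names below (andtext for the search term, rows for the result count, format=json) follow the public Chronicling America search API; if the post uses different ones, treat these as assumptions.

```python
from urllib.parse import urlencode

def search_url(term, rows=20):
    """Build a Chronicling America page-search URL for a term."""
    base = "https://chroniclingamerica.loc.gov/search/pages/results/"
    return base + "?" + urlencode({"andtext": term, "rows": rows, "format": "json"})

url = search_url("suffrage", rows=5)
print(url)
```

Fetching that URL with any HTTP client returns a JSON document whose fields include the matching pages and the total number of results.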
Can send text between any two adjacent vim windows.
Text can be defined by visual selection, motions and text objects.
Tries to position the cursor in a convenient place after each call.
Dot repeatable.
Can be used with Python or R for a data-science REPL experience.
Advanced gnuplot functions.
A simple gnuplot introduction.
The simplest, fastest way to get business intelligence and analytics to everyone in your company (Metabase).
Extensive Python introduction book leaning toward CLI work and data ingestion (JSON, CSV), as well as transformation (working with dates, numbering, etc.).