Basic usage of the chaotic_neutral package

This example notebook shows how to use the basic astro-ph-GA-23May2021 model trained on the ~30k most recent ArXiv astro-ph.GA papers up to May 23, 2021.

A live version of this notebook can be found here on Google Colab.

Users can query the model for similar papers using either - input_type: arxiv_id or keywords for the doc_id field. In the latter case, input a list of keyword strings. - return_n: controls how many results to return.

Additional arguments include: - show_authors (default = False): set to True to show author list - show_summary (default = False): set to True to show a 1-2 sentence abstract summary generated using the summa package.

[1]:
import chaotic_neural as cn

If you are running this notebook on your machine from a cloned version of the repository and are using the pre-trained model that comes with chaotic_neutral, please make sure to 1. either run this notebook in the same /docs/tutorial directory you find it in, or 2. change the directory cn_dir in the cell below to match the directory that chaotic_neural is installed in

[2]:
model_data = cn.load_trained_doc2vec_model('astro-ph-GA-23May2021', cn_dir = '../../chaotic_neural/')
model, all_titles, all_abstracts, all_authors, train_corpus, test_corpus = model_data
[3]:
sims = cn.list_similar_papers(model_data, doc_id = 1903.10457, input_type='arxiv_id')
ArXiv id:  1903.10457
Title: Learning the Relationship between Galaxies Spectra and their Star
  Formation Histories using Convolutional Neural Networks and Cosmological
  Simulations
-----------------------------
Most similar/relevant papers:
-----------------------------
0 Learning the Relationship between Galaxies Spectra and their Star
  Formation Histories using Convolutional Neural Networks and Cosmological
  Simulations  (Corrcoef: 0.99 )
1 MCSED: A flexible spectral energy distribution fitting code and its
  application to $z \sim 2$ emission-line galaxies  (Corrcoef: 0.73 )
2 Augmenting machine learning photometric redshifts with Gaussian mixture
  models  (Corrcoef: 0.73 )
3 MAGPHYS+photo-z: Constraining the Physical Properties of Galaxies with
  Unknown Redshifts  (Corrcoef: 0.72 )
4 MOSFIRE Spectroscopy of Quiescent Galaxies at 1.5 < z < 2.5. II - Star
  Formation Histories and Galaxy Quenching  (Corrcoef: 0.71 )
5 Stellar Populations of over one thousand $z\sim0.8$ Galaxies from
  LEGA-C: Ages and Star Formation Histories from D$_n$4000 and H$δ$  (Corrcoef: 0.70 )
6 Comparison of Observed Galaxy Properties with Semianalytic Model
  Predictions using Machine Learning  (Corrcoef: 0.70 )
7 Simultaneous analysis of SDSS spectra and GALEX photometry with
  STARLIGHT: Method and early results  (Corrcoef: 0.70 )
8 Swift/UVOT+MaNGA (SwiM) Value-added Catalog  (Corrcoef: 0.68 )
9 Evaluating hydrodynamical simulations with green valley galaxies  (Corrcoef: 0.68 )
[4]:
sims = cn.list_similar_papers(model_data, doc_id = ['quenching','galaxy'],
                           input_type='keywords',
                           return_n=10, show_authors = True, show_summary=True)
Keyword(s):  ['quenching', 'galaxy']
multi-keyword
-----------------------------
Most similar/relevant papers:
-----------------------------
0 The cumulative star-formation histories of dwarf galaxies with TNG50. I:
  Environment-driven diversity and connection to quenching  (Corrcoef: 0.58 )
Authors:------
[{'name': 'Gandhali D. Joshi'}, {'name': 'Annalisa Pillepich'}, {'name': 'Dylan Nelson'}, {'name': 'Elad Zinger'}, {'name': 'Federico Marinacci'}, {'name': 'Volker Springel'}, {'name': 'Mark Vogelsberger'}, {'name': 'Lars Hernquist'}]
Summary:------
The key factors determining the dwarfs' SFHs are their status as central or satellite and their stellar mass, with centrals and more massive dwarfs assembling their stellar mass at later times on average compared to satellites and lower mass dwarfs.
TNG50 predicts a large diversity in SFHs for both centrals and satellites, so that the stacked cumulative SFHs are representative of the TNG50 dwarf populations only in an average sense and individual dwarfs can have significantly different cumulative SFHs. Satellite dwarfs with the highest stellar mass to host mass ratios have the latest stellar mass assembly.

1 YZiCS: Preprocessing of dark halos in the hydrodynamic zoom-in
  simulation of clusters  (Corrcoef: 0.57 )
Authors:------
[{'name': 'San Han'}, {'name': 'Rory Smith'}, {'name': 'Hoseung Choi'}, {'name': 'Luca Cortese'}, {'name': 'Barbara Catinella'}, {'name': 'Emanuele Contini'}, {'name': 'Sukyoung K. Yi'}]
Summary:------
We find ~48% of today's cluster members were once satellites of other hosts.
From a sample of heavily tidally stripped members in clusters today, nearly three quarters were previously in a host.

2 The infall of dwarf satellite galaxies are influenced by their host's
  massive accretions  (Corrcoef: 0.56 )
Authors:------
[{'name': "Richard D'Souza"}, {'name': 'Eric F. Bell'}]
Summary:------
Using zoom-in dark matter-only simulations of MW-mass haloes and concentrating on subhaloes that are thought to be capable of hosting dwarf galaxies, we demonstrate that the infall of a massive progenitor is accompanied with the accretion and destruction of a large number of subhaloes.

3 The breakBRD Breakdown: Using IllustrisTNG to Track the Quenching of an
  Observationally-Motivated Sample of Centrally Star-Forming Galaxies  (Corrcoef: 0.56 )
Authors:------
[{'name': 'Claire Kopenhafer'}, {'name': 'Tjitske K. Starkenburg'}, {'name': 'Stephanie Tonnesen'}, {'name': 'Sarah Tuttle'}]
Summary:------
However, the central, non-splashback breakBRD galaxies show similar environments, black hole masses, and merger rates, indicating that there is not a single formation trigger for inner star formation and outer quenching.

4 The ACS LCID Project: On the origin of dwarf galaxy types: a
  manifestation of the halo assembly bias?  (Corrcoef: 0.56 )
Authors:------
[{'name': 'C. Gallart'}, {'name': 'M. Monelli'}, {'name': 'L. Mayer'}, {'name': 'A. Aparicio'}, {'name': 'G. Battaglia'}, {'name': 'E. J. Bernard'}, {'name': 'S. Cassisi'}, {'name': 'A. A. Cole'}, {'name': 'A. E. Dolphin'}, {'name': 'I. Drozdovsky'}, {'name': 'S. L. HIdalgo'}, {'name': 'J. F. Navarro'}, {'name': 'S. Salvadori'}, {'name': 'E. D. Skillman'}, {'name': 'P. B. Stetson'}, {'name': 'D. R. Weisz'}]
Summary:------
We argue that these galaxies can be assigned to two basic types: fast dwarfs that started their evolution with a dominant and short star formation event, and slow dwarfs that formed a small fraction of their stars early and have continued forming stars until the present time (or almost).

5 Mufasa:The strength and evolution of galaxy conformity in various
  tracers  (Corrcoef: 0.54 )
Authors:------
[{'name': 'Mika Rafieferantsoa'}, {'name': 'Romeel Davé'}]
Summary:------
Mufasa produces conformity in observed properties such as colour, sSFR, and HI content; i.e neighbouring galaxies have similar properties.
We show that low-mass and non-quenched haloes have weak conformity ($S(R)\leq 0.5$) extending to large projected radii $R$ in all properties, while high-mass and quenched haloes have strong conformity ($S(R)\sim 1$) that diminishes rapidly with $R$ and disappears at $R\geq 1$ Mpc.

6 Star Formation in Isolated Dwarf Galaxies Hosting Tidal Debris:
  Extending the Dwarf-Dwarf Merger Sequence  (Corrcoef: 0.54 )
Authors:------
[{'name': 'Erin Kado-Fong'}, {'name': 'Jenny E. Greene'}, {'name': 'Johnny P. Greco'}, {'name': 'Rachael Beaton'}, {'name': 'Andy D. Goulding'}, {'name': 'Sean D. Johnson'}, {'name': 'Yutaka Komiyama'}]
Summary:------
These findings extend the observed dwarf-dwarf merger sequence with a significant sample of dwarf galaxies, indicating that star formation triggered in mergers between dwarf galaxies continues after coalescence.

7 WALLABY Pilot Survey: First Look at the Hydra I Cluster and Ram Pressure
  Stripping of ESO 501-G075  (Corrcoef: 0.54 )
Authors:------
[{'name': 'T. N. Reynolds'}, {'name': 'T. Westmeier'}, {'name': 'A. Elagali'}, {'name': 'B. Catinella'}, {'name': 'L. Cortese'}, {'name': 'N. Deg'}, {'name': 'B. -Q. For'}, {'name': 'P. Kamphuis'}, {'name': 'D. Kleiner'}, {'name': 'B. S. Koribalski'}, {'name': 'K. Lee-Waddell'}, {'name': 'S. -H. Oh'}, {'name': 'J. Rhee'}, {'name': 'P. Serra'}, {'name': 'K. Spekkens'}, {'name': 'L. Staveley-Smith'}, {'name': 'A. R. H. Stevens'}, {'name': 'E. N. Taylor'}, {'name': 'J. Wang'}, {'name': 'O. I. Wong'}]
Summary:------
We conclude that, as ESO 501-G075 has a typical HI mass compared to similar galaxies in the field and its morphology is compatible with a ram pressure scenario, ESO 501-G075 is likely recently infalling into the cluster and in the early stages of experiencing ram pressure.

8 SDSS-IV MaNGA: The Formation Sequence of S0 Galaxies  (Corrcoef: 0.53 )
Authors:------
[{'name': 'Amelia Fraser-McKelvie'}, {'name': 'Alfonso Aragón-Salamanca'}, {'name': 'Michael Merrifield'}, {'name': 'Martha Tabor'}, {'name': 'Mariangela Bernardi'}, {'name': 'Niv Drory'}, {'name': 'Taniya Parikh'}, {'name': 'Maria Argudo-Fernández'}]
Summary:------
In order to determine the conditions in which each scenario dominates, we derive stellar populations of both the bulge and disk regions of 279 lenticular galaxies in the MaNGA survey.
Old and metal-rich bulges and disks belong to massive galaxies, and young and metal-poor bulges and disks are hosted by low-mass galaxies.

9 Secondary Infall in the Seyfert's Sextet: A Plausible Way Out of the
  Short Crossing Time Paradox  (Corrcoef: 0.53 )
Authors:------
[{'name': 'Omar López-Cruz'}, {'name': 'Héctor Javier Ibarra-Medel'}, {'name': 'Sebastián F. Sánchez'}, {'name': 'Mark Birkinshaw'}, {'name': 'Christopher Añorve'}, {'name': 'Jorge K. Barrera-Ballesteros'}, {'name': 'Jesús Falcon-Barroso'}, {'name': 'Wayne A. Barkhouse'}, {'name': 'Juan P. Torres-Papaqui'}]
Summary:------
We suggest that after the first turn-around, initially gas-rich galaxies crossed for the first time, consuming most of their gas.
Therefore, we suggest that SS galaxies have survived one crossing during a Hubble time.

[ ]: