Portrait of Lieutenant Benjamin J. Gaston, 1861
This is a portrait of Lieutenant Benjamin J. Gaston, taken in 1861.
Un recuerdo: polka [for piano] / by Benjamin J. Baquerizo; [illustrated by] A. Barbizet
Uniform title: Baquerizo, Benjamin J. (18..-19.. ; composer). Composer. [Un recuerdo. Piano]. Subjects: Polkas (piano), 19th century (1800-1899); Piano music, 19th century (1800-1899)
TaPvex: A Tagged and Phrased word2vec Model
Benjamin J. Radford
September 29, 2017
Summary
TaPvex is a trained word2vec model of words and phrases tagged with parts of speech and named entities. The model was trained on a large corpus of English-language news text from the early 2010s. Words were tagged using Stanford CoreNLP with named-entity (NER) labels and Penn Treebank part-of-speech (POS) tags. Tagged words were then concatenated into n-gram phrases.
The model contains 1.17 million unique words and phrases. Word vectors are of size 150.
Use
All three files (TaPvex, TaPvex.syn0.npy, TaPvex.syn1.npy) must be located in the same directory. The model can then be loaded with the gensim Python module:
from gensim.models import Word2Vec
model = Word2Vec.load("/path/to/model/TaPvex")
Examples
Tokens are of the form:
[WORD]:[NER]:[POS]
Phrases are of the form:
[WORD]:[NER]:[POS]_[WORD]:[NER]:[POS]
Example tokens include:
BUSH:O:NN
BUSHES:O:NNS
BUSH:PERSON:NNP
GEORGE:PERSON:NNP_BUSH:PERSON:NNP
GEORGE:PERSON:NNP_W:PERSON:NNP_BUSH:PERSON:NNP
NEW:O:JJ
NEW:LOCATION:NNP_YORK:LOCATION:NNP
NEW:ORGANIZATION:NNP_YORK:ORGANIZATION:NNP_TIMES:ORGANIZATION:NNP
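The token and phrase formats described above can be illustrated with a small sketch. The helper names `make_token` and `parse_token` are hypothetical and not part of TaPvex or gensim. The sketch assumes that words are uppercased (as in the example tokens) and that individual words contain no underscore, since the underscore separates the words of a phrase.

```python
from typing import List, Tuple

# One tagged word: (word, NER tag, POS tag). Hypothetical type alias.
Tagged = Tuple[str, str, str]


def make_token(parts: List[Tagged]) -> str:
    """Join (word, ner, pos) triples into a [WORD]:[NER]:[POS] key.

    Multiple triples are joined with "_" to form a phrase key.
    Uppercasing is an assumption based on the example tokens.
    """
    return "_".join(f"{word.upper()}:{ner}:{pos}" for word, ner, pos in parts)


def parse_token(token: str) -> List[Tagged]:
    """Split a key back into (word, ner, pos) triples.

    Assumes no word contains "_". The split on ":" is taken from the
    right so that a colon inside the word itself would be preserved.
    """
    triples = []
    for piece in token.split("_"):
        word, ner, pos = piece.rsplit(":", 2)
        triples.append((word, ner, pos))
    return triples


phrase = make_token([("George", "PERSON", "NNP"), ("Bush", "PERSON", "NNP")])
print(phrase)
print(parse_token("NEW:LOCATION:NNP_YORK:LOCATION:NNP"))
```

A key built this way could then be used to look up a vector in the loaded model; whether a given key is present depends on the model's vocabulary.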
Headlines of War data set
The Headlines of War dataset is an event detection and event linking (i.e. event coreference resolution) evaluation dataset. It includes three event classes: militarized interstate dispute incidents (MIDI), militarized interstate disputes (MID), and a negative class. The MIDI class is nested below the MID class: a MID comprises a collection of one or more MIDI events.
The first iteration of Headlines of War was presented at the AESPEN workshop during LREC 2020.
Please cite the following paper:
Radford, Benjamin J. (2020). "Seeing the Forest for the Trees: Detection and Cross-Document Coreference Resolution of Militarized Interstate Disputes." In Proceedings of the Workshop on Automated Extraction of Socio-Political Events from News (AESPEN) at LREC 2020.
Un pensamiento: polka-mazurka, op. 14 / composed for the piano by Benjamin J. Baquerizo; [illustrated by] A. Barbizet
Uniform title: Baquerizo, Benjamin J. (18..-19.. ; composer). Composer. [Un pensamiento. Piano]. Subjects: Polkas-mazurkas (piano), 19th century (1800-1899); Piano music, 19th century (1800-1899)
Smith, Benjamin J.
Carte de visite of Benjamin J. Smith, 11th Maine Infantry, Company C, who also served in the Maine Legislature in 1878. From the MacDonald Collection. https://digitalmaine.com/arc_civilwarportraits/2733/thumbnail.jp
Benjamin J. Kern
Benjamin J. Kern (1818-1849) was a member of John C. Fremont's 4th expedition. He was killed by American Indians.
