java - Getting original text after using Stanford NLP parser


Hello people of the internet,

We're having the following problem with the Stanford NLP API: we have a String that we want to transform into a list of sentences. First, we used String sentenceString = Sentence.listToString(sentence); but listToString does not return the original text because of the tokenization. Now we tried to use listToOriginalTextString in the following way:

    private static List<String> getSentences(String text) {
        Reader reader = new StringReader(text);
        DocumentPreprocessor dp = new DocumentPreprocessor(reader);
        List<String> sentenceList = new ArrayList<String>();

        for (List<HasWord> sentence : dp) {
            String sentenceString = Sentence.listToOriginalTextString(sentence);
            sentenceList.add(sentenceString.toString());
        }

        return sentenceList;
    }

This does not work. Apparently we have to set an attribute "invertible" to true, but we don't know how to. How can we do this?

In general, how do you use listToOriginalTextString properly? What preparations do I need?

Sincerely, Khayet

If I understand correctly, you want the mapping of the tokens to the original input text after tokenization. You can do this like this:

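    import java.io.StringReader;
    import java.util.List;
    import edu.stanford.nlp.ling.CoreLabel;
    import edu.stanford.nlp.process.PTBTokenizer;
    import edu.stanford.nlp.process.WordToSentenceProcessor;

    // split via PTBTokenizer (PTBLexer); `text` is the original input String
    List<CoreLabel> tokens = PTBTokenizer.coreLabelFactory().getTokenizer(new StringReader(text)).tokenize();

    // do the processing using the Stanford sentence splitter (WordToSentenceProcessor)
    WordToSentenceProcessor<CoreLabel> processor = new WordToSentenceProcessor<CoreLabel>();
    List<List<CoreLabel>> splitSentences = processor.process(tokens);

    // for each sentence
    for (List<CoreLabel> s : splitSentences) {
        // for each word
        for (CoreLabel token : s) {
            // here you can get the token value and position like:
            // token.value(), token.beginPosition(), token.endPosition()
        }
    }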
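
Once each token carries its character offsets, the original text of a sentence can be cut straight out of the input string, from the start of its first token to the end of its last one. The following is only a minimal sketch of that idea using the standard library: the `Token` class and the `originalSentences` method are hypothetical names standing in for `CoreLabel` and your own code, and the sketch assumes the end offset is exclusive (one past the last character), as `endPosition()` is generally understood to be.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OffsetSentences {

    /** Hypothetical stand-in for CoreLabel: only the character offsets matter here. */
    static final class Token {
        final int begin; // offset of the first character, like token.beginPosition()
        final int end;   // offset one past the last character, like token.endPosition()

        Token(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    /** Returns the untouched slice of the input text that each sentence covers. */
    static List<String> originalSentences(String text, List<List<Token>> sentences) {
        List<String> result = new ArrayList<String>();
        for (List<Token> sentence : sentences) {
            if (sentence.isEmpty()) {
                continue; // nothing to cut out for an empty sentence
            }
            int begin = sentence.get(0).begin;
            int end = sentence.get(sentence.size() - 1).end;
            result.add(text.substring(begin, end));
        }
        return result;
    }

    public static void main(String[] args) {
        // Note the double space after "Hello": it survives because we slice the input.
        String text = "Hello  world. It's fine.";
        List<Token> first = Arrays.asList(new Token(0, 5), new Token(7, 12), new Token(12, 13));
        List<Token> second = Arrays.asList(
                new Token(14, 16), new Token(16, 18), new Token(19, 23), new Token(23, 24));

        for (String s : originalSentences(text, Arrays.asList(first, second))) {
            System.out.println("[" + s + "]");
        }
    }
}
```

With real CoreLabel tokens you would read the two offsets from token.beginPosition() and token.endPosition() instead of the hypothetical Token fields. Because the sentences are slices of the input, the original spacing and punctuation are kept exactly as typed.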
