

Author: Reese (UK)





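  Natural language processing (NLP) is an important area of application development, and its relevance to solving contemporary problems will only grow. Demand for natural-language-accessible applications supported by NLP tasks has increased significantly. Reese's "Natural Language Processing with Java" (reprint edition, English) shows how to organize text automatically using techniques such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. The book introduces a range of NLP concepts that you can follow even without a background in statistical natural language processing.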


Chapter 1: Introduction to NLP
 What is NLP?
 Why use NLP?
 Why is NLP so hard?
 Survey of NLP tools
 Apache OpenNLP
 Stanford NLP
 Overview of text processing tasks
 Finding parts of text
 Finding sentences
 Finding people and things
 Detecting Parts of Speech
 Classifying text and documents
 Extracting relationships
 Using combined approaches
 Understanding NLP models
 Identifying the task
 Selecting a model
 Building and training the model
 Verifying the model
 Using the model
 Preparing data
Chapter 2: Finding Parts of Text
 Understanding the parts of text
 What is tokenization?
 Uses of tokenizers
 Simple Java tokenizers
 Using the Scanner class
 Specifying the delimiter
 Using the split method
 Using the BreakIterator class
 Using the StreamTokenizer class
 Using the StringTokenizer class
 Performance considerations with Java core tokenization
 NLP tokenizer APIs
 Using the OpenNLPTokenizer class
 Using the SimpleTokenizer class
 Using the WhitespaceTokenizer class
 Using the TokenizerME class
 Using the Stanford tokenizer
 Using the PTBTokenizer class
 Using the DocumentPreprocessor class
 Using a pipeline
 Using LingPipe tokenizers
 Training a tokenizer to find parts of text
 Comparing tokenizers
 Understanding normalization
 Converting to lowercase
 Removing stopwords
 Creating a StopWords class
 Using LingPipe to remove stopwords
 Using stemming
 Using the Porter Stemmer
 Stemming with LingPipe
 Using lemmatization
 Using the StanfordLemmatizer class
 Using lemmatization in OpenNLP
 Normalizing using a pipeline
Chapter 3: Finding Sentences
 The SBD process
 What makes SBD difficult?
 Understanding SBD rules of LingPipe's HeuristicSentenceModel class
 Simple Java SBDs
 Using regular expressions
 Using the BreakIterator class
 Using NLP APIs
 Using OpenNLP
 Using the SentenceDetectorME class
 Using the sentPosDetect method
 Using the Stanford API
 Using the PTBTokenizer class
 Using the DocumentPreprocessor class
 Using the StanfordCoreNLP class
 Using LingPipe
 Using the IndoEuropeanSentenceModel class
 Using the SentenceChunker class
 Using the MedlineSentenceModel class
 Training a Sentence Detector model
 Using the Trained model
 Evaluating the model using the SentenceDetectorEvaluator class
Chapter 4: Finding People and Things
 Why NER is difficult?
 Techniques for name recognition
 Lists and regular expressions
 Statistical classifiers
 Using regular expressions for NER
 Using Java's regular expressions to find entities
 Using LingPipe's RegExChunker class
 Using NLP APIs
 Using OpenNLP for NER
 Determining the accuracy of the entity
 Using other entity types
 Processing multiple entity types
 Using the Stanford API for NER
 Using LingPipe for NER
 Using LingPipe's name entity models
 Using the ExactDictionaryChunker class
 Training a model
 Evaluating a model
Chapter 5: Detecting Parts of Speech
 The tagging process
 Importance of POS taggers
 What makes POS difficult?
 Using the NLP APIs
 Using OpenNLP POS taggers
 Using the OpenNLP POSTaggerME class for POS taggers
 Using OpenNLP chunking
 Using the POSDictionary class
 Using Stanford POS taggers
 Using Stanford MaxentTagger
 Using the MaxentTagger class to tag textese
 Using Stanford pipeline to perform tagging
 Using LingPipe POS taggers
 Using the HmmDecoder class with BestFirst tags
 Using the HmmDecoder class with NBest tags
 Determining tag confidence with the HmmDecoder class
 Training the OpenNLP POSModel
Chapter 6: Classifying Texts and Documents
 How classification is used
 Understanding sentiment analysis
 Text classifying techniques
 Using APIs to classify text
 Using OpenNLP
 Training an OpenNLP classification model
 Using DocumentCategorizerME to classify text
 Using Stanford API
 Using the ColumnDataClassifier class for classification
 Using the Stanford pipeline to perform sentiment analysis
 Using LingPipe to classify text
 Training text using the Classified class
 Using other training categories
 Classifying text using LingPipe
 Sentiment analysis using LingPipe
 Language identification using LingPipe
Chapter 7: Using Parser to Extract Relationships
 Relationship types
 Understanding parse trees
 Using extracted relationships
 Extracting relationships
 Using NLP APIs
 Using OpenNLP
 Using the Stanford API
 Using the LexicalizedParser class
 Using the TreePrint class
 Finding word dependencies using the GrammaticalStructure class
 Finding coreference resolution entities
 Extracting relationships for a question-answer system
 Finding the word dependencies
 Determining the question type
 Searching for the answer
Chapter 8: Combined Approaches
 Preparing data
 Using Boilerpipe to extract text from HTML
 Using POI to extract text from Word documents
 Using PDFBox to extract text from PDF documents
 Using the Stanford pipeline
 Using multiple cores with the Stanford pipeline
 Creating a pipeline to search text









