These are the steps my bot goes through before it fetches an answer from the database.
I used some NLP tools: https://github.com/angeloskath/php-nlp-tools/
Input Cleaning:
Clean and preprocess the input question (e.g., remove special characters, convert to lowercase).
Tokenization:
Split the input into individual tokens (words).
Stop Word Removal:
Remove common stop words that don't contribute to the meaning.
Spelling Correction:
Correct any spelling mistakes in the tokens using a fuzzy matching algorithm.
Synonym Mapping:
Map tokens to their synonyms to handle variations in phrasing.
Stemming:
Reduce tokens to their base or root form.
Keyword Matching:
Check for direct keyword matches with predefined questions.
POS Tagging:
Perform Part-of-Speech (POS) tagging on the input to identify the grammatical structure.
NER Tagging:
Perform Named Entity Recognition (NER) tagging to identify entities like names and places.
TF-IDF Vectorization:
Convert the preprocessed input and stored questions into TF-IDF vectors.
Cosine Similarity Calculation:
Compute the cosine similarity between the input vector and stored question vectors.
Best Match Selection:
Select the stored question with the highest cosine similarity score.
Fetch Answer:
Retrieve the corresponding answer from the database for the best-matched question.
Response Generation:
Generate and return the response with the matched question, answer, and tagging results.
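The core of the steps above (cleaning, tokenization, stop word removal, fuzzy spelling correction, TF-IDF vectorization, cosine similarity, best match selection) can be sketched roughly as follows. This is only an illustration in plain Python using the standard library, not the PHP code of the bot and not the php-nlp-tools API. `STOP_WORDS`, `STORED_QUESTIONS`, and all function names are hypothetical, the IDF smoothing is one common choice among several, and synonym mapping, stemming, and POS/NER tagging are left out.

```python
import math
import re
from collections import Counter
from difflib import get_close_matches

# Hypothetical data: a tiny stop-word list and two stored questions.
STOP_WORDS = {"is", "the", "a", "an", "of", "on", "in", "to"}
STORED_QUESTIONS = [
    "where is professor murat ozgoren office located",
    "what time does the library open",
]


def preprocess(text, vocabulary=None):
    """Lowercase, strip special characters, tokenize, drop stop words."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    if not vocabulary:
        return tokens
    # Fuzzy spelling correction: snap an unknown token to the closest
    # known word, if one is similar enough (cutoff is an assumption).
    corrected = []
    for token in tokens:
        if token in vocabulary:
            corrected.append(token)
            continue
        close = get_close_matches(token, vocabulary, n=1, cutoff=0.8)
        corrected.append(close[0] if close else token)
    return corrected


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def best_match(user_msg, questions=STORED_QUESTIONS):
    """Return the stored question most similar to user_msg, with its score."""
    stored = [preprocess(q) for q in questions]
    vocabulary = sorted({term for doc in stored for term in doc})
    query = preprocess(user_msg, vocabulary)
    vectors = tfidf_vectors(stored + [query])
    query_vec = vectors[-1]
    scores = [cosine(query_vec, vec) for vec in vectors[:-1]]
    best = max(range(len(questions)), key=scores.__getitem__)
    return questions[best], scores[best]


question, score = best_match("where is prof murat kids")
print(question, round(score, 3))
```

Note that this selection always returns some stored question, even when the input contains words such as "kids" that appear in none of them. The score only says how close the best candidate is, not that it is correct.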
where is dr murat kids
http://localhost/AIGOOD/get_question.php?userMsg=where%20is%20prof%20murat%20kids
{
"answer": "Professor Murat OZGOREN office is located on the ground floor of the faculty of engineering\r\n\r\n\r\n\r\n\r\nhttps://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJ31UK6KN0fn8DOEGlrFQjgEItLJv3FlNg8ImWfaG9IZiSGEu32KqqwwROyo4mRwcWDR_ndWl9SIhAMHvKMkCGonG0fC8BHcGCYh41PNxN27db0oBOfvrtntwDjPhA62t0yfiPrcmxqCEbS49o4nR__PgJkCbB-XZx152VU6WGV73FHYhc9pVCFLpUAaw/s320/ezgif-3-For%20the%20orange.gif",
"matchedQuestion": "where is Professor Murat OZGOREN office located?",
"matchedSearchWords": [
" murat ozgoren office",
" dr murat ozgoren office",
" prof murat office",
" professor murat ozgoren office",
" dr murat office ",
"where professor murat ozgoren office located",
"where is dr murat ozgoren office located?"
],
"posTaggingResult": [
"where/WRB\r\n",
"is/VBZ\r\n",
"prof/NN",
"murat/NNP\r\n",
"kids/NNS"
],
"nerTaggingResult": [
"where/O",
"is/O",
"prof/O",
"murat/PERSON",
"kids/O"
]
}
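The tagging results in the response above are `token/TAG` strings, and some of them carry a trailing `\r\n`. A small sketch of how they could be normalized before use, in plain Python; `parse_tags` is a hypothetical helper and not part of the bot or of php-nlp-tools:

```python
def parse_tags(tagged):
    """Split 'token/TAG' strings into (token, tag) pairs, trimming stray whitespace."""
    pairs = []
    for item in tagged:
        # rpartition splits on the last "/", so the tag is always the final part.
        token, _, tag = item.strip().rpartition("/")
        pairs.append((token, tag))
    return pairs


# Values copied from the response above.
pos = parse_tags(["where/WRB\r\n", "is/VBZ\r\n", "prof/NN", "murat/NNP\r\n", "kids/NNS"])
ner = parse_tags(["where/O", "is/O", "prof/O", "murat/PERSON", "kids/O"])

# Tokens the NER step recognized as a person.
people = [token for token, tag in ner if tag == "PERSON"]
print(pos)
print(people)
```

In this response only "murat" is tagged as PERSON. "kids" is tagged `O`, so the NER result does not tie it to the person at all.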
What can I do?
I expect the bot to answer with 100% accuracy.
More details here: https://stackoverflow.com/questions/787 ... g-of-input