Create an Azure Function to convert an HTML table to JSON (C#)
Anonymous
Two years ago I asked the question
How do I convert an HTML table into JSON in Logic Apps
@Skin provided the answer and it continues to work flawlessly.
Recently, however, I started a project to extract table data from some webpages. I created a new function and copied the code from the first function. The code that @Skin provided looked for span elements with a class="value" attribute in order to identify and split the data.
I was able to modify the code to work with one of the tables I need to convert by removing one of the variables and changing the line var values = xmlRow.SelectNodes(".//span[@class]"); to var values = xmlRow.SelectNodes(".//td");.
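To make the difference between the two selectors concrete, here is a minimal sketch. The sample row and the XPathDemo helper are hypothetical and not part of the original function; the row is just an illustration in which one cell wraps its text in a span and the other does not.

```csharp
using System;
using System.Xml;

public static class XPathDemo
{
    // Hypothetical sample row: the first cell uses a span with a class attribute,
    // the second cell holds plain text.
    public const string SampleXml =
        "<table><tr>" +
        "<td><span class=\"value\">John Dodo 1</span></td>" +
        "<td>Some Date 1</td>" +
        "</tr></table>";

    // Counts the nodes that the given XPath expression matches inside the first row.
    public static int CountMatches(string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(SampleXml);
        XmlNode row = doc.DocumentElement.SelectSingleNode("//tr");
        return row.SelectNodes(xpath).Count;
    }
}
```

With this sample, ".//span[@class]" matches only the one cell that contains a span, while ".//td" matches both cells, which is why the td selector suits tables whose cells carry plain text.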
That change worked for that table, since each data set had only two values, Name and Start Date. Using a Compose Action I reworked the table to be very simple:
John Dodo 1    Some Date 1
John Dodo 2    Some Date 2
John Dodo 3    Some Date 3

The returned JSON is
[
    { "name": "John Dodo 1", "startDate": "Some Date 1" },
    { "name": "John Dodo 2", "startDate": "Some Date 2" },
    { "name": "John Dodo 3", "startDate": "Some Date 3" }
]

However, when a table has three values, like Name, Company, and Address, and I add the appropriate variable to the code, I get a server error and I can't figure out why. I could modify the table to add span elements with class="value" attributes to match the table that @Skin helped with, but I would like to understand why it won't work.
Here is the modified code that won't work:
    #r "Newtonsoft.Json"

    using System.Net;
    using Microsoft.AspNetCore.Mvc;
    using Microsoft.Extensions.Primitives;
    using System.Collections.Generic;
    using Newtonsoft.Json;
    using System.Xml;

    public class TableData
    {
        public string PmRep { get; set; }
        public string PmComp { get; set; }
        public string PmAddr { get; set; }
    }

    public static async Task<IActionResult> Run(HttpRequest req, ILogger log)
    {
        var outputTable = new List<TableData>();

        string requestBody = String.Empty;
        using (StreamReader streamReader = new StreamReader(req.Body))
        {
            requestBody = await streamReader.ReadToEndAsync();
        }

        dynamic data = JsonConvert.DeserializeObject(requestBody);
        string xmlString = System.Text.Encoding.UTF8.GetString(Convert.FromBase64String((string)data?.Content));

        var xmlDocument = new XmlDocument();
        xmlDocument.LoadXml(xmlString);

        var xmlRows = xmlDocument.DocumentElement.SelectNodes("//tr");
        foreach (XmlNode xmlRow in xmlRows)
        {
            var values = xmlRow.SelectNodes(".//td");
            if (values.Count > 0)
            {
                var row = new TableData()
                {
                    PmRep = values[0].InnerText,
                    PmComp = values[1].InnerText
                    PmAddr = values[2].InnerText
                };
                outputTable.Add(row);
            }
        }

        return new OkObjectResult(outputTable);
    }

and here is the sample table:
Name 1    Comp 1    Address 1
Name 2    Comp 2    Address 2
Name 3    Comp 3    Address 3
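One thing that stands out in the code as posted: the object initializer has no comma between the PmComp and PmAddr assignments. That is a compile error in C#, and a script that fails to compile would plausibly surface as a server error. The following is a minimal sketch of the row mapping with the comma in place, not a confirmed fix. The TableParser class and its Parse method are hypothetical names introduced for illustration, and the guard for rows with fewer than three cells is an added assumption (indexing past the end of an XmlNodeList yields null, so reading InnerText there would throw).

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public class TableData
{
    public string PmRep { get; set; }
    public string PmComp { get; set; }
    public string PmAddr { get; set; }
}

public static class TableParser
{
    // Hypothetical helper: maps every <tr> that has at least three <td> cells
    // to a TableData row and ignores the rest.
    public static List<TableData> Parse(string xmlString)
    {
        var outputTable = new List<TableData>();

        var xmlDocument = new XmlDocument();
        xmlDocument.LoadXml(xmlString); // throws XmlException if the markup is not well-formed XML

        foreach (XmlNode xmlRow in xmlDocument.DocumentElement.SelectNodes("//tr"))
        {
            var values = xmlRow.SelectNodes(".//td");

            // Assumption: header rows (th only) and short rows should be skipped
            // rather than cause a null reference on values[2].
            if (values.Count < 3)
            {
                continue;
            }

            outputTable.Add(new TableData
            {
                PmRep = values[0].InnerText,
                PmComp = values[1].InnerText, // comma that is absent in the posted code
                PmAddr = values[2].InnerText
            });
        }

        return outputTable;
    }
}
```

Inside the function, the loop body of Run could be replaced by a call such as TableParser.Parse(xmlString), with the result passed to OkObjectResult as before. If the server error persists, the function's log output should show whether it is a compilation failure or an XmlException from markup that is valid HTML but not well-formed XML.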
Source: https://stackoverflow.com/questions/780 ... le-to-json