Windows batch file to find a variable string in an HTML file
I am trying to write a Windows batch file that looks through a specific HTML file, which looks like this (simplified):
<input name="pattern" value="*.var" type="text" /><img style="width: 16px; height: 16px; vertical-align:middle; cursor:pointer" onclick="this.parentnode.submit()" class="icon-go-next icon-sm" src="/static/474743c8/images/16x16/go-next.png" /></form></div><table class="filelist"><tr><td><img style="width: 16px; height: 16px; " class="icon-text icon-sm" src="/static/474743c8/images/16x16/text.png" /></td><td><a href="./address.var.varapplication-varapplication-varwebservice-05.05.07-snapshot.var">address.var.varapplication-varapplication-varwebservice-05.05.07-snapshot.var</a></td><td class="filesize">133.49 mb</td><td><a href="./address.var.varapplication-varapplication-varwebservice-05.05.07-snapshot.var/*fingerprint*/"><img style="width: 16px; height: 16px; " class="icon-fingerprint icon-sm" src="/static/474743c8/images/16x16/fingerprint.png" /></a> <a href="./address.var.varapplication-varapplication-varwebservice-05.05.07-snapshot.var/*view*/">view</a></td></tr><tr><td style="text-align:right;" colspan="3"><div style="margin-top: 1em;"><a href="./*.var/*zip*/target.zip"><img style="width: 16px; height: 16px; " class="icon-package icon-sm" src="/static/474743c8/images/16x16/package.png" />
and use the build version (e.g. 05.05.07-snapshot; the version format will remain the same next time) as a variable in the batch file. I have tried findstr with no success:
for /f "delims=" %%a in ('findstr /ic "webservice" a.html') do set "line=%%a"
set "line=%line:*webservice=%"
for /f "delims=" %%a in ("%line%") do set string=%%a
for %%b in ("%line%") do @ set "var=%%b"
set build=%var:~-11,8%
echo. %build%
When parsing structured markup, it's better to treat it as a hierarchical object than as flat text. Not only is it easier to navigate the hierarchy than to match strings with tokens or regexp, but an object-oriented approach is also more resistant to changes in formatting (whether the code is minified, beautified, line breaks are introduced, whatever).
With that in mind, I suggest using querySelectorAll to select the anchor tags that are descendants of table elements with a class name of "filelist". Then use a regex to scrape the version info from the anchor tag's href attribute.
@if (@CodeSection == @Batch) @then
@echo off & setlocal

set "html=test.html"

rem Run the JScript half of this file; it prints build=<version>, which "set" stores.
for /f "delims=" %%I in ('cscript /nologo /e:JScript "%~f0" "%html%"') do set "%%I"

echo %build%

goto :EOF
@end // end batch / begin JScript hybrid code

var htmlfile = WSH.CreateObject('htmlfile'),
    fso = WSH.CreateObject('Scripting.FileSystemObject'),
    file = fso.OpenTextFile(WSH.Arguments(0), 1),
    html = file.ReadAll();

file.Close();

// The meta tag puts the htmlfile object into IE9 mode so querySelectorAll is available.
htmlfile.write('<meta http-equiv="x-ua-compatible" content="IE=9" />' + html);

var anchors = htmlfile.querySelectorAll('table.filelist a');

// Echo the first href that ends in webservice-<version>.var, then stop.
for (var i = 0; i < anchors.length; i++) {
    if (/webservice-((\d+\.)*\d.+)\.var$/i.test(anchors[i].href)) {
        WSH.Echo('build=' + RegExp.$1);
        WSH.Quit(0);
    }
}
What's cooler is, if the HTML file you're scraping is served by a web server, you can use Microsoft.XMLHTTP methods to retrieve the HTML without having to rely on wget or curl or similar. It requires only a few minor changes to the code above.
@if (@CodeSection == @Batch) @then
@echo off & setlocal

set "url=http://www.domain.com/file.html"

rem Run the JScript half of this file; it prints build=<version>, which "set" stores.
for /f "delims=" %%I in ('cscript /nologo /e:JScript "%~f0" "%url%"') do set "%%I"

echo %build%

goto :EOF
@end // end batch / begin JScript hybrid code

var xhr = WSH.CreateObject('Microsoft.XMLHTTP'),
    htmlfile = WSH.CreateObject('htmlfile');

// Request the page asynchronously, then poll until the response is complete.
xhr.open('GET', WSH.Arguments(0), true);
xhr.setRequestHeader('User-Agent', 'XMLHTTP/1.0');
xhr.send('');
while (xhr.readyState != 4) WSH.Sleep(50);

// The meta tag puts the htmlfile object into IE9 mode so querySelectorAll is available.
htmlfile.write('<meta http-equiv="x-ua-compatible" content="IE=9" />' + xhr.responseText);

var anchors = htmlfile.querySelectorAll('table.filelist a');

// Echo the first href that ends in webservice-<version>.var, then stop.
for (var i = 0; i < anchors.length; i++) {
    if (/webservice-((\d+\.)*\d.+)\.var$/i.test(anchors[i].href)) {
        WSH.Echo('build=' + RegExp.$1);
        WSH.Quit(0);
    }
}
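If you want to check what the regular expression captures before wiring it into the batch file, here is a minimal standalone sketch in plain JavaScript. It is not part of the scripts above: the helper name extractBuild and the hrefs array are made up for illustration, and the sample values are simply the href attributes copied from the simplified HTML in the question. It assumes you run it with Node.js; under cscript you would swap console.log for WSH.Echo.

```javascript
// Hypothetical helper (not in the original scripts): returns the build
// version captured from an href, or null when the href does not match.
function extractBuild(href) {
    var match = /webservice-((\d+\.)*\d.+)\.var$/i.exec(href);
    return match ? match[1] : null;
}

// href values copied from the simplified HTML in the question.
var hrefs = [
    './address.var.varapplication-varapplication-varwebservice-05.05.07-snapshot.var',
    './address.var.varapplication-varapplication-varwebservice-05.05.07-snapshot.var/*fingerprint*/',
    './address.var.varapplication-varapplication-varwebservice-05.05.07-snapshot.var/*view*/',
    './*.var/*zip*/target.zip'
];

// Only the first href ends in ".var", so only it yields a version.
for (var i = 0; i < hrefs.length; i++) {
    console.log(extractBuild(hrefs[i]));
}
```

Because the pattern is anchored with $, the fingerprint, view and zip links are skipped even though two of them contain the same file name. That is why the loop in the scripts can stop at the first match.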