CasperJS and MediaWiki: automating the XML export

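I'm trying to use CasperJS to automate a MediaWiki XML export, because we have no access to the machine that hosts the wiki. The catch is that the XML download is a POST request whose response is the XML itself. This is what I have so far (the code comes from another Stack Overflow question):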

casper.then(function () {
    var theFormRequest = this.page.evaluate(function () {
        var request = {};
        var formDom = document.forms[0];

        // Capture the form's target URL and field values instead of submitting it
        formDom.onsubmit = function () {
            var data = {};
            // Iterate over the form fields
            for (var i = 0; i < formDom.elements.length; i++) {
                data[formDom.elements[i].name] = formDom.elements[i].value;
            }
            request.action = formDom.action;
            request.data = data;
            return false; // Stop form submission
        };

        // Trigger the submit handler via jQuery
        $(".visualClear").submit();
        return request;
    });

    // JSON.stringify so the field values are printed rather than "[object Object]"
    this.echo("DOWNLOADING  " + theFormRequest.action + "  " + JSON.stringify(theFormRequest.data));
    casper.download(theFormRequest.action, "downloaded_file.xml", "POST", theFormRequest.data);
});

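I tried the resource.received event, but that only gives me the response metadata, not the actual data. At the moment what gets downloaded is the HTML page rather than the XML. I can confirm that the button is being clicked: if I remove formDom.onsubmit, resource.received shows that the returned content type is XML.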

The categories differ from the ones in my MediaWiki, but this page on mediawiki.org does exactly the same thing: https://www.mediawiki.org/wiki/Special:Export

Thanks

Edit: these are the response headers when the export is done through a browser:

Accept-Ranges:bytes
Age:0
Connection:keep-alive
Content-disposition:attachment;filename=file-20161010165904.xml
Content-Length:172717
Content-Type:application/xml; charset=utf-8
Date:Mon, 10 Oct 2016 16:59:04 GMT
Server:nginx
Via:1.1 varnish
X-Content-Type-Options:nosniff
X-Powered-By:HHVM/3.15.1
X-Varnish:1482412076
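
The headers above can also be checked from a script. The following is a minimal sketch, not part of the original question: `describeExportResponse` is a hypothetical helper, and it assumes the object passed to a `resource.received` listener exposes `contentType` and a `headers` array of `{name, value}` pairs, as PhantomJS response objects generally do. It reports whether the response looks like XML and which filename the server suggested.

```javascript
// Hypothetical helper: summarise a response object of the kind passed to a
// resource.received listener (assumed shape: contentType + headers[]).
function describeExportResponse(resource) {
    var info = { isXml: false, filename: null };

    // application/xml or text/xml, with or without a charset suffix
    if (/(application|text)\/xml/i.test(resource.contentType || '')) {
        info.isXml = true;
    }

    var headers = resource.headers || [];
    for (var i = 0; i < headers.length; i++) {
        // Header names are case-insensitive, so normalise before comparing
        if (String(headers[i].name).toLowerCase() === 'content-disposition') {
            var match = /filename="?([^";]+)"?/i.exec(headers[i].value);
            if (match) {
                info.filename = match[1];
            }
        }
    }
    return info;
}

// Sample built from the browser response headers quoted in the question
var sample = {
    contentType: 'application/xml; charset=utf-8',
    headers: [
        { name: 'Content-disposition', value: 'attachment;filename=file-20161010165904.xml' },
        { name: 'Content-Length', value: '172717' }
    ]
};
console.log(JSON.stringify(describeExportResponse(sample)));
```

Inside a CasperJS script this could be called from `casper.on('resource.received', function (resource) { ... })` to confirm that the POST really returned the export. As the question notes, the event only carries metadata, so the body still has to be fetched separately.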

1 Answer

Here is what I would do:

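var casper = require('casper').create();

casper.start();

casper.open('http://url_to_mediawiki', {
    method: 'GET',
    headers: {
        'Content-Type': 'application/xml; charset=utf-8'
    },
    encoding: 'utf8'
});

casper.then(function() {
    // Print whatever the page returned (HTML, XML, plain text, ...)
    this.echo(this.getPageContent());
});

casper.run();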

getPageContent() is content-agnostic and flexible: it simply returns whatever it finds. See casper#getpagecontent and casper#open.
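
One thing that may be worth checking in the question's code: the loop copies every form element into the POST data, including unchecked checkboxes and every submit button, which is not what a browser sends. Whether that explains the HTML response is a guess, not something the thread confirms. The sketch below shows a stricter field collector. `collectFormData` is a hypothetical helper, and the field names in the sample (`catname`, `addcat`, `pages`, `curonly`, `templates`, `wpDownload`) are assumptions about what a Special:Export form contains, so compare them with the real form before relying on them.

```javascript
// Hypothetical helper: keep only the fields a browser would actually submit.
// `elements` is any array-like of objects with name/value/type/checked/disabled,
// such as formDom.elements; `clickedButtonName` names the pressed submit button
// (or null if the pressed button has no name).
function collectFormData(elements, clickedButtonName) {
    var data = {};
    for (var i = 0; i < elements.length; i++) {
        var el = elements[i];
        var type = String(el.type || '').toLowerCase();

        // Unnamed and disabled controls are never submitted
        if (!el.name || el.disabled) { continue; }
        // Unchecked checkboxes and radio buttons are left out
        if ((type === 'checkbox' || type === 'radio') && !el.checked) { continue; }
        // Plain buttons and reset buttons never contribute a value
        if (type === 'button' || type === 'reset') { continue; }
        // Of the submit buttons, only the one that was pressed is sent
        if ((type === 'submit' || type === 'image') && el.name !== clickedButtonName) { continue; }

        data[el.name] = el.value;
    }
    return data;
}

// Stand-in for formDom.elements; names and values are illustrative assumptions
var fakeElements = [
    { name: 'catname', type: 'text', value: 'MyCategory' },
    { name: 'addcat', type: 'submit', value: 'Add' },
    { name: 'pages', type: 'textarea', value: 'Main Page' },
    { name: 'curonly', type: 'checkbox', value: '1', checked: true },
    { name: 'templates', type: 'checkbox', value: '1', checked: false },
    { name: 'wpDownload', type: 'checkbox', value: '1', checked: true },
    { name: '', type: 'submit', value: 'Export' }
];
console.log(JSON.stringify(collectFormData(fakeElements, null)));
```

To use this with CasperJS, the function body would have to be defined inside the evaluate() callback, since code running in the page cannot see functions from the script's scope. The returned object can then be passed as the data argument of casper.download(action, "downloaded_file.xml", "POST", data), as in the question.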

Answered 2024-05-03T02:43:09+08:00
