Spring Batch reader performance problem

We have a Spring Batch job that reads data from the local hard disk, creates an object, and sends it to Apache Solr for indexing. We see very poor performance when running under JBoss compared with running the same code standalone.

Profiling with YourKit shows that the time is consumed by the component that reads the data and creates the Solr objects.

In our measurements, 1,000 records take 7 seconds standalone but 7 minutes when run as a batch process inside JBoss.

/**
 * Name       : AccountingReader.java
 * Description: This class is used to read accounting source data from the flat files.
 * References : None
 */

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.annotation.BeforeStep;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ParseException;
import org.springframework.batch.item.UnexpectedInputException;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.LineMapper;
import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.batch.item.file.transform.LineTokenizer;
import org.springframework.core.io.Resource;

public class AccountingReader implements ItemReader<Accounting> {

    /** Logger. */
    private static final Logger logger = LoggerFactory.getLogger(AccountingReader.class);
    private final ExecutionContext executionContext = new ExecutionContext();
    private Resource resource;
    private FlatFileItemReader<Accounting> reader;

/**
 * Gets the input Resource[file] to read the data from
 *
 * @return the Resource[file] with input data
 *
 */
public Resource getResource() {
    return resource;
}

/**
 * Set the input Resource[file] to read the data from
 *
 * @param resource the Resource[file] with input data
 */
public void setResource(final Resource resource) {
    this.resource = resource;
}

/**
 * Method which maps read lines to Accounting object
 *
 * @return LineMapper the <code>LineMapper</code> object
 */
private LineMapper<Accounting> createAccountingLineMapper() {

    final DefaultLineMapper<Accounting> accountingLineMapper = new DefaultLineMapper<Accounting>();

    final LineTokenizer accountingLineTokenizer = createAccoutingLineTokenizer();
    accountingLineMapper.setLineTokenizer(accountingLineTokenizer);

    final FieldSetMapper<Accounting> accountingMapper = createAccountingMapper();
    accountingLineMapper.setFieldSetMapper(accountingMapper);

    return accountingLineMapper;
}

/**
 * Method use to delimit field names in file set
 *
 * @return LineTokenizer the line tokenizer object containing the row and field delimiters
 */
private LineTokenizer createAccoutingLineTokenizer() {

    final DelimitedLineTokenizer accoutingLineTokenizer = new DelimitedLineTokenizer();
    accoutingLineTokenizer.setDelimiter(Constants.FILE_ROW_SPLITTER);
    accoutingLineTokenizer.setNames(Constants.ACCOUNTING_FIELDS.split(Constants.COMMA));

    return accoutingLineTokenizer;
}
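For illustration, here is a plain-Java sketch of what the delimited tokenizer above effectively does to one input line. The field names and the `|` delimiter are assumptions made for the example; the real values come from `Constants.ACCOUNTING_FIELDS` and `Constants.FILE_ROW_SPLITTER`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Sketch of delimited tokenization: split one line on a delimiter and
// associate each value with its configured field name.
public class TokenizeDemo {

    static Map<String, String> tokenize(String line, String delimiter, String[] names) {
        // -1 keeps trailing empty fields, matching typical tokenizer behavior
        String[] values = line.split(Pattern.quote(delimiter), -1);
        Map<String, String> fieldSet = new LinkedHashMap<>();
        for (int i = 0; i < names.length && i < values.length; i++) {
            fieldSet.put(names[i], values[i]);
        }
        return fieldSet;
    }

    public static void main(String[] args) {
        String[] names = {"accountId", "amount", "currency"}; // illustrative names
        Map<String, String> fs = tokenize("A-100|250.00|USD", "|", names);
        System.out.println(fs); // {accountId=A-100, amount=250.00, currency=USD}
    }
}
```

The `FieldSetMapper` then binds these named values onto the `Accounting` bean's properties by name.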

/**
 * Method used to create Mapper based on the domain object
 *
 * @return the Mapper which is created based on the domain object [Accounting] targeted
 */
private FieldSetMapper<Accounting> createAccountingMapper() {
    final BeanWrapperFieldSetMapper<Accounting> accountingMapper = new BeanWrapperFieldSetMapper<Accounting>();
    accountingMapper.setTargetType(Accounting.class);

    return accountingMapper;
}

/**
 * Reads the data line by line and set to Accounting object using the LineMapper created previously
 *
 * @return Accounting object produced by parsing each line
 * @throws UnexpectedInputException if unexpected input is encountered in the flat file
 * @throws ParseException if a line cannot be parsed
 */
@Override
public Accounting read() throws UnexpectedInputException, ParseException {
    Accounting accounting = null;
    logger.debug("Accounting read() - Starts");
    final long start = System.currentTimeMillis();
    try {
        accounting = reader.read();
    } catch (final Exception ex) {
        logger.error("Error occurred while reading the Accounting file: " + ex.getMessage(), ex);
    }
    final long elapsed = System.currentTimeMillis() - start;
    logger.debug("Accounting read() - Ends in {} ms", elapsed);
    return accounting;
}
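One per-record cost worth ruling out is log message construction: with plain string concatenation, the message (and any timing text it embeds) is built on every `read()` even when DEBUG is disabled. A minimal plain-Java sketch of the difference, with no SLF4J dependency and purely illustrative names:

```java
// Sketch: eager concatenation builds the message before the log level is
// checked; a guard (or SLF4J's parameterized "{}" form) skips that work.
public class LogGuardDemo {
    static final boolean DEBUG_ENABLED = false; // debug off, as in production
    static int messagesBuilt = 0;

    // Stands in for an expensive message fragment (timing text, toString(), ...)
    static String elapsedText() {
        messagesBuilt++;
        return "elapsed=42ms";
    }

    static void debug(String msg) {
        if (DEBUG_ENABLED) System.out.println(msg);
    }

    public static void main(String[] args) {
        // Eager: the argument is evaluated before debug() can check the level.
        debug("read() ends " + elapsedText());
        int builtEagerly = messagesBuilt;

        // Guarded: nothing is built while the level is disabled.
        if (DEBUG_ENABLED) {
            debug("read() ends " + elapsedText());
        }
        System.out.println(builtEagerly + "," + messagesBuilt); // prints "1,1"
    }
}
```

This alone rarely explains a 60x slowdown, but per-record `System.currentTimeMillis()` calls plus eager message building do add measurable overhead at high record counts.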

/**
 * Sets FlatFileItemReader as the reader to read input flat file [resource]
 * Also the job start timestamp is set in the execution context.
 *
 * @param stepExecution  the <code>StepExecution</code> object
 */
@BeforeStep
public void beforeStep(final StepExecution stepExecution) {
    reader = new FlatFileItemReader<Accounting>();
    reader.setResource(resource);
    reader.setLinesToSkip(1);
    reader.setEncoding("ISO-8859-1");

    final LineMapper<Accounting> accountingLineMapper = createAccountingLineMapper();
    reader.setLineMapper(accountingLineMapper);
    reader.open(executionContext);
    stepExecution.getExecutionContext()
                 .putString(Constants.STARTTIME, SolrUtils.getCurrentTimeStamp());
}
}
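For comparison, the reader assembled by hand in `beforeStep()` could instead be declared as a step-scoped bean, letting Spring Batch manage `open()`/`close()` through the step lifecycle. This is only a sketch: the delimiter, field names, input-file job parameter, and the `Accounting` package are assumptions, not the project's actual configuration.

```xml
<!-- Sketch: declarative equivalent of the reader built in beforeStep() -->
<bean id="accountingReader" scope="step"
      class="org.springframework.batch.item.file.FlatFileItemReader">
    <property name="resource" value="#{jobParameters['inputFile']}"/>
    <property name="linesToSkip" value="1"/>
    <property name="encoding" value="ISO-8859-1"/>
    <property name="lineMapper">
        <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
            <property name="lineTokenizer">
                <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
                    <property name="delimiter" value="|"/>       <!-- illustrative -->
                    <property name="names" value="accountId,amount,currency"/> <!-- illustrative -->
                </bean>
            </property>
            <property name="fieldSetMapper">
                <bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
                    <property name="targetType" value="com.example.batch.Accounting"/> <!-- assumed package -->
                </bean>
            </property>
        </bean>
    </property>
</bean>
```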

The same code takes far less time when triggered standalone. We plan to use partitioning to improve performance once the current issue is fixed.
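The partitioning mentioned above could look roughly like the following Spring Batch XML, assuming one partition per input file via `MultiResourcePartitioner`; bean names, the file pattern, and the grid size are illustrative assumptions.

```xml
<!-- Sketch: one partition per accounting file, executed in parallel -->
<bean id="filePartitioner"
      class="org.springframework.batch.core.partition.support.MultiResourcePartitioner">
    <property name="resources" value="file:/data/accounting/*.csv"/> <!-- assumed path -->
</bean>

<batch:step id="accountingMasterStep">
    <batch:partition step="accountingSlaveStep" partitioner="filePartitioner">
        <batch:handler grid-size="4" task-executor="taskExecutor"/>
    </batch:partition>
</batch:step>
```

Each slave step's reader would then pick up its own file from the step execution context (the partitioner stores it under the key `fileName` by default).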

A YourKit sample for 10 records and the time it takes: <37 calls> com.tms.incentives.batch.process.itemreader.AccountingReader.read() 90 9 10; java.util.LinkedHashMap$LinkedHashIterator.hasNext() (LinkedHashMap.java) 90 0 90 334267
