Make rules for creating dependencies between targets in a project's subdirectories

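The source tree of my dissertation research software (written in R) reflects the traditional research workflow: "collect data -> prepare data -> analyze data -> collect results -> publish results". I use make to build and maintain the workflow (most of the project's subdirectories contain a Makefile).

However, I frequently need to run individual parts of the workflow through specific Makefile targets in the project's subdirectories, rather than through the top-level Makefile. This raises the problem of setting up Makefile rules that maintain dependencies between different parts of the workflow, in other words, between targets in Makefile files located in different subdirectories.

The following represents my dissertation project's setup: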

+-- diss-floss (Project's root)
|-- import (data collection)
|-- cache (R data objects (), representing different data sources, in sub-directories)
|-+ prepare (data cleaning, transformation, merging and sampling)
  |-- R modules, including 'transform.R'
|-- analysis (data analyses, including exploratory data analysis (EDA))
  |-- R modules, including 'eda.R'
|-+ results (results of the analyses, in sub-directories)
  |-+ eda (*.svg, *.pdf, ...)
  |-- ...
|-- present (auto-generated presentation for defense)

Snippets of targets from some of my Makefile files:

"~/diss-floss/Makefile" (almost complete):

# Major variable definitions

PROJECT="diss-floss"
HOME_DIR="~/diss-floss"
REPORT={$(PROJECT)-slides}

COLLECTION_DIR=import
PREPARATION_DIR=prepare
ANALYSIS_DIR=analysis
RESULTS_DIR=results
PRESENTATION_DIR=present

RSCRIPT=Rscript

# Targets and rules 

all: rprofile collection preparation analysis results presentation

rprofile:
    R CMD BATCH ./.Rprofile

collection:
    cd $(COLLECTION_DIR) && $(MAKE)

preparation: collection
    cd $(PREPARATION_DIR) && $(MAKE)

analysis: preparation
    cd $(ANALYSIS_DIR) && $(MAKE)

results: analysis
    cd $(RESULTS_DIR) && $(MAKE)

presentation: results
    cd $(PRESENTATION_DIR) && $(MAKE)


## Phony targets and rules (for commands that do not produce files)

#.html
.PHONY: demo clean

# run demo presentation slides
demo: presentation
    # knitr(Markdown) => HTML page
    # HTML5 presentation via RStudio/RPubs or Slidify
    # OR
    # Shiny app

# remove intermediate files
clean:
    rm -f tmp*.bz2 *.Rdata

"~/diss-floss/import/Makefile":

importFLOSSmole: getFLOSSmoleDataXML.R
    @$(RSCRIPT) $(R_OPTS) $<
...

"~/diss-floss/prepare/Makefile":

transform: transform.R
    $(RSCRIPT) $(R_OPTS) $<
...

"~/diss-floss/analysis/Makefile":

eda: eda.R
    @$(RSCRIPT) $(R_OPTS) $<

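Currently, I am concerned about creating the following dependency:

Data collected by making a target from the Makefile in import always needs to be transformed by making the corresponding target from the Makefile in prepare before it is analyzed by, for example, eda.R. If I manually run make in import and then, forgetting about the transformation, run make eda in analysis, things do not go well. Therefore, my question is:

How could I use features of the make utility (in the simplest way possible) to establish and maintain rules for dependencies between targets from Makefile files in different directories?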

2 Answers

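The problem with the way you use makefiles right now is that you list only the code as dependencies, not the data. That is where a lot of the magic happens. If "analysis" knew which files it was going to use and listed them as dependencies, make could look back to see how they were made and what dependencies they had in turn. If an earlier file in the pipeline has been updated, make can then run all the steps needed to bring the later files up to date. For example: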
```
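# Convenience names; the real targets are the data files below.
.PHONY: import transform plot

import: rawdata.csv

rawdata.csv:
	scp remoteserver:/rawdata.csv .

transform: transdata.csv

# $< would expand to gogo.pl (the first prerequisite),
# so the input file is named explicitly.
transdata.csv: gogo.pl rawdata.csv
	perl gogo.pl rawdata.csv > $@

plot: plot.png

plot.png: plot.R transdata.csv
	Rscript plot.R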
```
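So if I run make import, it will download a new csv file. Then, if I run make plot, it will try to build plot.png. That file depends on transdata.csv, which in turn depends on rawdata.csv. Since rawdata.csv has been updated, make first has to update transdata.csv, and only then is it ready to run the R script. If you don't list the data files as dependencies, you miss out on much of the power of make. To be fair, getting all the dependencies right can sometimes be tricky (especially if one step produces multiple outputs).
Answered 2024-05-08T13:17:02+08:00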
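To carry the same idea across subdirectories, one possible sketch follows. It assumes GNU make, and the file names are hypothetical: `eda.svg` as the output of `eda.R`, and a `transdata.csv` rule that would have to exist in prepare/Makefile. The analysis step lists the prepared data file as a prerequisite and delegates building it to the sibling directory's Makefile.

```make
# analysis/Makefile (sketch; file names are assumptions, not from the project)
PREP_DIR  := ../prepare
PREP_DATA := $(PREP_DIR)/transdata.csv

.PHONY: eda FORCE

eda: eda.svg

# The result depends on the script AND on the prepared data next door.
eda.svg: eda.R $(PREP_DATA)
	Rscript eda.R

# Hand the data file over to the Makefile that knows how to build it.
# FORCE makes this recipe run on every invocation, so prepare/Makefile
# decides whether transdata.csv is actually out of date.
$(PREP_DATA): FORCE
	$(MAKE) -C $(PREP_DIR) transdata.csv

FORCE:
```

With this layout, running `make eda` inside analysis first asks prepare whether its data is current. In GNU make, `eda.svg` is rebuilt only if the sub-make actually changed `transdata.csv`, because the file's timestamp is re-checked after the recipe runs.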

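The following are my thoughts (with some ideas from @MrFlick's answer, thank you) on adding my research workflow's data dependencies to the project's current make infrastructure (with code snippets). I have also tried to reflect the desired workflow by specifying dependencies between make targets.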

import/Makefile:

importFLOSSmole: getFLOSSmoleDataXML.R FLOSSmole.RData
    @$(RSCRIPT) $(R_OPTS) $<
    @touch $@.done

(similar targets for other data sources)

prepare/Makefile:

IMPORT_DIR=../import

prepare: import \
         transform \
         cleanup \
         merge \
         sample

import: $(IMPORT_DIR)/importFLOSSmole.done # and/or other flag files, as needed

transform: transform.R import
    @$(RSCRIPT) $(R_OPTS) $<
    @touch $@.done

cleanup: cleanup.R transform
    @$(RSCRIPT) $(R_OPTS) $<
    @touch $@.done

merge: merge.R cleanup
    @$(RSCRIPT) $(R_OPTS) $<
    @touch $@.done

sample: sample.R merge
    @$(RSCRIPT) $(R_OPTS) $<
    @touch $@.done

analysis/Makefile:

PREP_DIR=../prepare

analysis: prepare \
          eda \
          efa \
          cfa \
          sem

prepare: $(PREP_DIR)/transform.done # and/or other flag files, as needed

eda: eda.R prepare
    @$(RSCRIPT) $(R_OPTS) $<
    @touch $@.done

efa: efa.R eda
    @$(RSCRIPT) $(R_OPTS) $<
    @touch $@.done

cfa: cfa.R efa
    @$(RSCRIPT) $(R_OPTS) $<
    @touch $@.done

sem: sem.R cfa
    @$(RSCRIPT) $(R_OPTS) $<
    @touch $@.done

The contents of the Makefile files in the results and present directories are still TBD.
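One way the flag-file idea above might be tightened is to make each `.done` file the target itself, so that make can compare its timestamp with the script and with the previous step's flag. The following is only a sketch under that assumption. It reuses the script and flag names from the snippets above, and the pattern rule is an illustration rather than part of the project.

```make
# prepare/Makefile (sketch; assumes the .done flag naming used above)
IMPORT_DIR := ../import
RSCRIPT    := Rscript

.PHONY: prepare
prepare: sample.done

# One recipe for every step: run the matching R script, then record
# success. touch is reached only if the script exits without error.
%.done: %.R
	$(RSCRIPT) $(R_OPTS) $<
	touch $@

# Ordering between steps, expressed as extra prerequisites on real files.
# If the import flag is missing, make stops with "No rule to make target",
# which signals that import has not been run yet.
transform.done: $(IMPORT_DIR)/importFLOSSmole.done
cleanup.done:   transform.done
merge.done:     cleanup.done
sample.done:    merge.done
```

Because each flag is a real file, a step is rerun only when its script or the preceding flag is newer than it. A single flag also stands in reasonably well for a step that writes several output files.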

I would appreciate your thoughts and advice on the above!

Answered 2024-05-08T13:17:02+08:00
