Inputs API
This module handles the discovery and validation of input sequencing files.
seqnado.inputs.Metadata
Bases: BaseModel
Metadata for samples. Optional fields can be set to None.
set_mcc_defaults
set_mcc_defaults() -> Self
Set default consensus_group for MCC assay.
Source code in seqnado/inputs/core.py
seqnado.inputs.FastqCollection
Bases: BaseFastqCollection
Represents a collection of sequencing samples (FASTQ files) grouped into named sets, with optional per-sample metadata.
Attributes:
| Name | Type | Description |
|---|---|---|
| `fastq_sets` | `list[FastqSet]` | List of FastqSet objects (paired or single-end samples). |
| `metadata` | `list[Metadata]` | List of Metadata objects corresponding one-to-one with fastq_sets. |
fastq_pairs
property
fastq_pairs: dict[str, list[Path]]
Returns a dictionary mapping sample names to their FASTQ file paths.
validate_non_ip_assay
classmethod
validate_non_ip_assay(v: Assay) -> Assay
Ensure the assay doesn't require IP (immunoprecipitation).
Source code in seqnado/inputs/fastq.py
query
query(sample_name: str) -> FastqSet
Retrieve the FastqSet by its sample name.
Raises:
| Type | Description |
|---|---|
| `ValueError` | If sample_name is not found. |
Source code in seqnado/inputs/fastq.py
is_paired_end
is_paired_end(uid: str) -> bool
Check if the given sample ID is paired-end.
Source code in seqnado/inputs/fastq.py
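The two lookups above (`query` and `is_paired_end`) can be pictured with a small stand-alone sketch. `DemoFastqSet`, its field names, and both helper functions are hypothetical stand-ins written for illustration; they are not the seqnado classes, and the real methods may be implemented differently. The sketch assumes a sample counts as paired-end when it has a second read file.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional


@dataclass
class DemoFastqSet:
    """Hypothetical stand-in for FastqSet (not the seqnado class)."""

    sample_id: str
    r1: Path
    r2: Optional[Path] = None  # None means single-end in this sketch


def query(fastq_sets: List[DemoFastqSet], sample_name: str) -> DemoFastqSet:
    """Return the set whose name matches, or raise ValueError if absent."""
    for fastq_set in fastq_sets:
        if fastq_set.sample_id == sample_name:
            return fastq_set
    raise ValueError(f"Sample '{sample_name}' not found")


def is_paired_end(fastq_sets: List[DemoFastqSet], uid: str) -> bool:
    """Assumed rule: a sample is paired-end when a second read file exists."""
    return query(fastq_sets, uid).r2 is not None
```

Raising `ValueError` for an unknown name mirrors the documented behaviour of `query`; everything else here is an assumption.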
from_fastq_files
classmethod
from_fastq_files(
assay: Assay,
files: Iterable[str | Path],
metadata: (
Callable[[str], Metadata] | Metadata | None
) = None,
**fastqset_kwargs: Any
) -> FastqCollection
Build a FastqCollection by scanning a list of FASTQ paths:
- Convert raw paths to FastqFile.
- Group by `sample_base` and sort by read_number.
- Create a FastqSet (single- or paired-end) for each sample.
- Generate Metadata via `metadata(sample_name)`, or use the default.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `files` | `Iterable[str \| Path]` | Iterable of file paths (strings or Path). | required |
| `metadata` | `Callable[[str], Metadata] \| Metadata \| None` | | `None` |
| `fastqset_kwargs` | `Any` | Extra fields forwarded to FastqSet constructor. | `{}` |
Source code in seqnado/inputs/fastq.py
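The steps listed for `from_fastq_files` can be sketched with the standard library alone. The filename pattern (an optional `_R1`/`_R2` or `_1`/`_2` suffix), the helper names `group_fastqs` and `resolve_metadata`, and the use of plain dicts in place of Metadata are all assumptions for illustration; seqnado's own parsing rules and defaults may differ.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed naming scheme: <base>[_R1|_R2|_1|_2].(fastq|fq)[.gz]
_FASTQ_NAME = re.compile(
    r"^(?P<base>.+?)(?:_R?(?P<read>[12]))?\.(?:fastq|fq)(?:\.gz)?$"
)


def group_fastqs(files):
    """Group paths by sample base and order each group by read number."""
    grouped = defaultdict(list)
    for raw in files:
        path = Path(raw)
        match = _FASTQ_NAME.match(path.name)
        if match is None:
            raise ValueError(f"Not a recognised FASTQ name: {path.name}")
        # Files without a read suffix are treated as read 1 (single-end).
        read_number = int(match.group("read") or 1)
        grouped[match.group("base")].append((read_number, path))
    return {
        base: [path for _, path in sorted(reads)]
        for base, reads in grouped.items()
    }


def resolve_metadata(metadata, sample_name):
    """Accept a callable, a shared instance, or None (assumed empty default)."""
    if metadata is None:
        return {}
    if callable(metadata):
        return metadata(sample_name)
    return metadata
```

A group with one path would then become a single-end set and a group with two paths a paired-end set.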
from_directory
classmethod
from_directory(
assay: Assay,
directory: str | Path,
glob_patterns: Iterable[str] = (
"*.fq",
"*.fq.gz",
"*.fastq",
"*.fastq.gz",
),
metadata: (
Callable[[str], Metadata] | Metadata | None
) = None,
**kwargs: Any
) -> FastqCollection
Recursively scan a directory for FASTQ files and build a FastqCollection.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `directory` | `str \| Path` | Root path to search. | required |
| `glob_patterns` | `Iterable[str]` | Filename patterns to include. | `('*.fq', '*.fq.gz', '*.fastq', '*.fastq.gz')` |
| `metadata` | `Callable[[str], Metadata] \| Metadata \| None` | Callable(sample_name) → Metadata or single Metadata instance. | `None` |
| `**kwargs` | `Any` | Extra fields converted directly to a shared Metadata. | `{}` |
Source code in seqnado/inputs/fastq.py
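A recursive scan with the default patterns could look roughly like the following. `find_fastqs` is a hypothetical helper, not part of seqnado; it only shows how `pathlib.Path.rglob` can apply several glob patterns below a root directory.

```python
from pathlib import Path

DEFAULT_PATTERNS = ("*.fq", "*.fq.gz", "*.fastq", "*.fastq.gz")


def find_fastqs(directory, glob_patterns=DEFAULT_PATTERNS):
    """Collect files under `directory` that match any of the patterns."""
    root = Path(directory)
    found = set()  # a set avoids duplicates if patterns overlap
    for pattern in glob_patterns:
        found.update(p for p in root.rglob(pattern) if p.is_file())
    return sorted(found)
```

The resulting list could then be handed to a constructor such as `from_fastq_files`.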
to_dataframe
to_dataframe(validate: bool = True) -> pd.DataFrame
Export the design to a pandas DataFrame, validated by DataFrameDesign.
Columns: sample_name, r1, r2, plus all metadata fields.
Source code in seqnado/inputs/fastq.py
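The column layout described above can be illustrated without pandas by building one dict per sample. `to_rows` is a hypothetical helper, and the plain-dict metadata is an assumption; passing the resulting list to `pandas.DataFrame(...)` would give a frame with the same columns.

```python
from pathlib import Path


def to_rows(fastq_pairs, metadata):
    """Build design rows: sample_name, r1, r2, then any metadata fields.

    fastq_pairs maps sample name -> [r1] or [r1, r2];
    metadata maps sample name -> dict of extra fields (assumed shape).
    """
    rows = []
    for name, paths in fastq_pairs.items():
        row = {
            "sample_name": name,
            "r1": str(paths[0]),
            # Single-end samples have no second read file.
            "r2": str(paths[1]) if len(paths) > 1 else None,
        }
        row.update(metadata.get(name, {}))
        rows.append(row)
    return rows
```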
from_dataframe
classmethod
from_dataframe(
assay: Assay,
df: DataFrame,
validate_deseq2: bool = False,
assay_for_validation: Assay | None = None,
**fastqset_kwargs: Any
) -> FastqCollection
Build a FastqCollection from a DataFrame, validated by DataFrameDesign.
Expects columns: sample_name, r1, r2, plus any metadata fields.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
assay
|
Assay
|
The assay type |
required |
df
|
DataFrame
|
DataFrame with sample metadata |
required |
validate_deseq2
|
bool
|
If True, require deseq2 field to be non-null (for RNA assays) |
False
|
assay_for_validation
|
Assay | None
|
Assay type to check in validation context |
None
|
**fastqset_kwargs
|
Any
|
Additional kwargs for FastqSet |
{}
|
Source code in seqnado/inputs/fastq.py
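Reading a design back in amounts to splitting each row into read paths and metadata. The sketch below works on plain dict rows rather than a DataFrame; `rows_to_samples` is hypothetical, and treating an empty `r2` as single-end is an assumption, not documented seqnado behaviour.

```python
from pathlib import Path

READ_COLUMNS = ("sample_name", "r1", "r2")


def rows_to_samples(rows):
    """Split each design row into read paths and remaining metadata fields."""
    samples = {}
    for index, row in enumerate(rows):
        missing = [c for c in ("sample_name", "r1") if not row.get(c)]
        if missing:
            raise ValueError(f"Row {index} is missing: {missing}")
        reads = [Path(row["r1"])]
        if row.get("r2"):  # assumed: empty r2 means single-end
            reads.append(Path(row["r2"]))
        extra = {k: v for k, v in row.items() if k not in READ_COLUMNS}
        samples[row["sample_name"]] = {"reads": reads, "metadata": extra}
    return samples
```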
seqnado.inputs.BamCollection
Bases: BaseCollection
Collection of BAM files with optional per-sample metadata.
Provides convenience constructors analogous to those of FastqCollection, but without paired-end logic.
from_dataframe
classmethod
from_dataframe(
assay: Assay,
df: Any,
validate_deseq2: bool = False,
assay_for_validation: Assay | None = None,
**kwargs: Any
) -> BamCollection
Build a BamCollection from a DataFrame.
Expects columns: sample_id, bam, plus any metadata fields.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `assay` | `Assay` | The assay type | required |
| `df` | `Any` | DataFrame with sample metadata | required |
| `validate_deseq2` | `bool` | If True, require deseq2 field to be non-null (for RNA assays) | `False` |
| `assay_for_validation` | `Assay \| None` | Assay type to check in validation context | `None` |
| `**kwargs` | `Any` | Additional kwargs | `{}` |
Source code in seqnado/inputs/bam.py
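The effect of `validate_deseq2` on a BAM design can be sketched as a row check. `check_bam_rows` is a hypothetical helper operating on plain dict rows; the actual validation in seqnado is performed elsewhere and may be stricter.

```python
from pathlib import Path


def check_bam_rows(rows, validate_deseq2=False):
    """Validate sample_id/bam columns and, optionally, a non-null deseq2."""
    for index, row in enumerate(rows):
        for column in ("sample_id", "bam"):
            if not row.get(column):
                raise ValueError(f"Row {index} is missing '{column}'")
        if validate_deseq2 and row.get("deseq2") is None:
            raise ValueError(f"Row {index} has no deseq2 value")
    # One BAM per sample, so no read-pairing step is needed.
    return {row["sample_id"]: Path(row["bam"]) for row in rows}
```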