Quick Start
===========

After successfully installing **BIT** and configuring the reference ChIP-seq database, you are ready to use **BIT** for your analyses. The process is straightforward if you have any of the following file types indicating the chromosomal start and end positions of your epigenomic regions: `.bed`, `.narrowPeak`, `.broadPeak`, `.bigNarrowPeak` (`.bb`), or `.csv`.

Assuming your input file is located at `"file_path/file_name"` (acceptable formats: `.bed`, `.bb`, etc.), and you want to store the output in `"output_path"`, the command below will start a comprehensive comparison of your input regions against all available ChIP-seq reference datasets. It then initiates the **BIT** model using the Gibbs sampler to provide the Bayesian inference of the TR-level BIT score. The output data from each iteration will be saved as an R data object named `"file_name.rds"`.

.. code-block:: R

  > BIT(, "file_path/file_name.bed", "output_path/", show = TRUE, plot_bar = TRUE, N = 5000, bin_width = 1000, genome = "hg38")
  Loading and mapping peaks to bins...
  Done loading.
  Comparing the input regions with the pre-compiled reference ChIP-seq data, using a bin width of 1000 bps...
  Loading meta table...
  Starting alignment process...
    |==================================================| 100%
  Alignment complete.
  Starting BIT Gibbs sampler with 5000 iterations...
  0%   10   20   30   40   50   60   70   80   90   100%
  [----|----|----|----|----|----|----|----|----|----|
  **************************************************|
  Gibbs sampling completed.
  Output data saved as output_path/file_name.rds
  Loading data from file...
  Processing theta matrix and TR names...
  Compiling results...
  Results saved to output_path/file_name_rank_table.csv
  BIT process completed.


There are several default parameters you can change:

.. note::

   - **show**: `TRUE` / `FALSE`. Whether to display the ranking table. Default: `TRUE`.

   - **plot.bar**: `TRUE` / `FALSE`. Whether to plot the top 10 TRs BIT scores in a horizontal bar plot. Default: `TRUE`.

   - **format**: One of `"bed"`, `"narrowPeak"`, `"broadPeak"`, `"bigNarrowPeak"`, `"csv"`, or `NULL`. Specifies the format of the input file. Default: `NULL`. If set to `NULL`, **BIT** will automatically determine the file format based on its extension.

   - **N**: Integer. The number of iterations in the Gibbs sampler. Default: `5000`.

   - **bin_width**: Integer. The width of the bin used to divide the chromatin into non-overlapping bins. Default: `1000`. Only change this if you compile a different reference database.

   - **burnin**: `NULL` or an integer. Specifies the burn-in period when deriving the Bayesian inference for TR-level parameters. Default: `NULL`, which will use `N/2`.

   - **genome**: `hg38` or `mm10` for TR ChIP-seq data collected from different genome.

We can check the results:

.. code-block:: R

	> Results_Table <- read.csv("output_path/file_name_rank_table.csv")
	> head(Results_Table,10)
	         TR   Theta_i     lower     upper  BIT_score BIT_score_lower BIT_score_upper Rank
	1      CTCF -2.010571 -2.011593 -2.009676 0.11809745      0.11799114      0.11819079    1
	2     RAD21 -2.028610 -2.031747 -2.025619 0.11623164      0.11590978      0.11653925    2
	3      SMC3 -2.110100 -2.120542 -2.100907 0.10811898      0.10711622      0.10900866    3
	4     SMC1A -2.181305 -2.193447 -2.170246 0.10144192      0.10034048      0.10245443    4
	5      PHF2 -2.222166 -2.633869 -1.990377 0.09777755      0.06699025      0.12021697    5
	6    GABPB1 -2.301673 -2.689461 -2.062208 0.09098450      0.06359813      0.11282466    6
	7      CHD2 -2.317842 -2.370804 -2.270597 0.08965600      0.08542628      0.09358759    7
	8      NFYA -2.353494 -2.396030 -2.313238 0.08678843      0.08347594      0.09003250    8
	9    NELFCD -2.371794 -2.476460 -2.276018 0.08534898      0.07752502      0.09312872    9
	10     NFYC -2.373795 -2.767204 -2.122314 0.08519290      0.05912233      0.10694685   10