RNA sequencing reveals a complete but an unconventional type of dosage compensation in the domestic silkworm Bombyx mori

Sex chromosomal dose difference between sexes is often normalized by a gene regulatory mechanism called dosage compensation (DC). Studies indicate that DC mechanisms are generally effective in XY rather than ZW systems. However, DC studies in lepidopterans (ZW system) gave bewildering results. In Manduca sexta, DC was complete and in Plodia interpunctella, it was incomplete. In Heliconius species, dosage was found to be partly incomplete. In domesticated silkmoth Bombyx mori, DC studies have yielded contradictory results thus far, showing incomplete DC based on microarray data and a possible existence of DC based on recent reanalysis of same data. In this study, analysis of B. mori sexed embryos (78, 96 and 120 h) and larval heads using RNA sequencing suggest an onset of DC at 120 h. The average Z-linked expression is substantially less than autosomes, and the male-biased Z-linked expression observed at initial stages (78 and 96 h) gets almost compensated at 120 h embryonic stage and perfectly compensated in heads. Based on these findings, we suggest a complete but an unconventional type of DC, which may be achieved by reduced Z-linked expression in males (ZZ). To our knowledge, this is the first next-generation sequencing report showing DC in B. mori, clarifying the previous contradictions.

Sex chromosomal dose difference between sexes is often normalized by a gene regulatory mechanism called dosage compensation (DC). Studies indicate that DC mechanisms are generally effective in XY rather than ZW systems. However, DC studies in lepidopterans (ZW system) gave bewildering results. In Manduca sexta, DC was complete and in Plodia interpunctella, it was incomplete. In Heliconius species, dosage was found to be partly incomplete. In domesticated silkmoth Bombyx mori, DC studies have yielded contradictory results thus far, showing incomplete DC based on microarray data and a possible existence of DC based on recent reanalysis of same data. In this study, analysis of B. mori sexed embryos (78, 96 and 120 h) and larval heads using RNA sequencing suggest an onset of DC at 120 h. The average Z-linked expression is substantially less than autosomes, and the male-biased Zlinked expression observed at initial stages (78 and 96 h) gets almost compensated at 120 h embryonic stage and perfectly compensated in heads. Based on these findings, we suggest a complete but an unconventional type of DC, which may be achieved by reduced Z-linked expression in males (ZZ). To our knowledge, this is the first next-generation sequencing report showing DC in B. mori, clarifying the previous contradictions. DC in ZZ/ZW species. However, in another lepidopteran species Plodia interpunctella [32], DC was demonstrated to be incomplete as it showed just over half of average expression of female Z-linked genes to that of males. Recently, in Heliconius (butterflies), the DC was reported to be imperfect or not complete, due to a consistent male-biased expression of Z-linked genes in various tissues tested [33]. Thus, a mixed pattern of DC mechanisms can be observed in female heterogametic species.
DC in female heterogametic species is proposed to be specified to a subset of dose-sensitive Z-linked genes (incomplete DC; [13]) and in a few others, it may be operating in a chromosome-wide manner [31]. In addition, the ZZ/ZW systems would have adapted many auxiliary phenomena for assisting the primary objective of DC, for example, the enrichment of the male-biased genes on the Z chromosome in a few female heterogametic species [34][35][36], which lead to sex chromosome-biased expression (SCBE) of a set of Z-linked genes. The female heterogametic species may be taking an advantage of SCBE. One striking example would be the sex determination in ZZ/ZO system (in wild silkmoths, Antheraea assama and Ant. mylita), where ZZ are males and ZO are females. Here, the double dose of Z chromosome stands as an essential criterion for male determination, presumably due to the male-biased expression of at least a subset of Z-linked genes. Thus, in this aspect, the sex-biased expression observed in most of the ZW species may be a general norm and not a disability; moreover, their successful survival and proliferation supports this assumption. Thus, the homogametic sex chromosomes (X/Z) were considered to be highly influenced by two phenomena: (i) the SCBE and (ii) the DC mechanism. It is not very clear whether SCBE and DC have coevolved or DC is followed by SCBE or vice versa [37]. The SCBE of X (Z) chromosome, achieved by an 'in and out' gene trafficking [38,39], may strengthen the sexually dimorphic traits; besides, this can be perceived as a counter-attacking force of DC.
The initial reports for DC in B. mori based on the expression profile of a set of a few Z-linked genes have suggested an incomplete DC, where males showed higher expression of Z-linked genes [40,41]. This was further confirmed by global, microarray analysis [29]. However, another study using reanalysis of the same microarray data draws a clue for the possibility of a globally operating DC mechanism in B. mori [42]. RNA-seq was proved to be an efficient tool in addressing DC on a genome-wide scale [23,24,[31][32][33]. There are two recent reports (based on RNA-seq), one for the involvement of a Z-linked gene called masc in B. mori DC [43] and another for the existence of an incomplete dosage in very early stage, and most of the Z-linked genes are dosage-compensated by 78 h of development [44]. Furthermore, in Bombyx, the process of sex determination is governed by the differential splicing of doublesex gene, Bmdsx. It has two splicing isoforms, Bmdsxf in females and Bmdsxm in males. These isoforms produce differential proteins, having antagonistic functions in sexes thus inducing sexual differentiation. Based on the Bmdsx splicing in eggs at various developmental stages, it is found that Bmdsxf splicing isoform is predominant at 12 h of development. This could be due to the maternal deposition of Bmdsxf pre-mRNA [45]. In the same study, at 24 h of development, there is a shift of splicing from female to male form, indicating the endogenous expression of Bmdsx mRNA [45]. However, in this study, few eggs showed an equal expression of Bmdsxf and Bmdsxm isoforms; probably these are female eggs. As the development progresses, there is a shift in splicing from the predominant or equally expressing Bmdsxm isoform to Bmdsxf isoform significantly in female eggs between 29 and 32 h. Based on these results, it was determined that the sex of the embryo gets fixed in between these developmental time points (29-32 h) [45]. However, this analysis was done in non-diapause strains [45].
In the current study, we did RNA-seq for three different embryonic stages of a diapausing bivoltine strain (78, 96 and 120 h) and fifth-instar larval heads. The relative expression of Z-linked genes to that of autosomal (Z : A) and the Z-linked genes between sexes (male : female, M : F) were analysed between male and female samples. In bivoltine strain, the eggs undergo diapause and are not hatched in 10 days. Hence, the eggs have to be acid-treated to break the diapause. It is also known that the development of these eggs is comparatively slower than the non-diapause eggs; hence, there is a possibility of delayed development in the eggs. In diapause eggs, we have found a similar pattern of Bmdsx splicing shift in male and female embryos, however, at later stages of development, i.e. at 78, 96 and 120 h. We have used fifth-instar larval heads as a reference sample (DC is expected to be established) for the egg samples. Based on the differential splicing of Bmdsx, we consider the 78 and 96 h stages to be before sex-determination stages and 120 h to be after sex-determining stage. Hence, the sex is determined in between the 96 and 120 h stage of development in diapause eggs of B. mori (figure 1a). However, in both the studies (diapause and non-diapause strains), it was found that it is the differential splicing of Bmdsx that occurs first followed by the advent of DC in B. mori. From these observations, we infer that though the time points for Bmdsx differential splicing and DC vary between non-diapause and diapause strains, the patterns of the occurrence of these processes are sequential and are comparably similar.

Z : A ratios: relative expression of Z chromosome
The Z : A ratio is informative about how DC is achieved in organisms. It provides an autosomal relative expression of Z-linked genes [13]. The Z : A ratio values of 0.5, 1 and 2 correspond to half, equal and double expression of Z-linked genes, respectively, to that of autosomes. The mean-(using FPKM data) and median-based (using log 2 -transformed FPKM) Z : A ratios were estimated from the 'true expression dataset' (FPKM = 0) (figure 1b and     of autosomal genes [43]. This suggests the role of Z-linked genes in influencing the expression of several autosomal genes. Further in B. mori Z-linked gene, masc controls the differential splicing of Bmdsx [43] that regulates sexual dimorphism and differentiation genetic network by influencing the expression of numerous autosomal genes [46]. In a similar way, various Z-linked genes may be globally influencing the expression levels of various autosomal genes.   Furthermore, we estimated the difference between the M : F distributions for A-and Z-linked genes. We found a significant difference between these distributions at 78 h (MWU, p = 0.005881), 96 h (MWU, p < 2.2 × 10 −16 ) and 120 h (MWU, p = 1.588 × 10 −9 ) but not for the head samples (MWU, p = 0.4517), suggesting that DC is apparent in head sample. The differences in the M : F distribution plots were also reflected in the Z : A ratio of medians (table 3); this ratio at 78 h (1.1) represents a slightly malebiased expression of Z-linked genes and a profoundly male-biased expression at 96 h (1.5); female-biased expression at 120 h (0.89) and an almost unbiased expression in head (1.02) samples ( figure 5 and table 3). Additionally, we did quartile-based max of male-or female-paired data analysis [33] for the Z-linked genes from the dataset used for M : F distribution analysis (figure 5) and showed a consistency in the observed biased expressions of Z-linked genes at various magnitudes of gene expression ( figure 6).
The expression of Z-linked genes (log 2 FPKM) was compared between sexes by using a heatmap ( figure 7). The genes with a relatively higher expression level showed a male-biased expression at early developmental stages, 78 and 96 h (cluster 9, 212 genes). The lowly expressing genes showed a femalebiased expression (cluster 2, 222 genes), exclusively at 78 and 96 h stages, and also showed an overall male-biased to -unbiased expression at 120 h stage. The cluster of genes having a very high level of expression (cluster 4, 61 genes) showed an almost unbiased expression in all the samples. Almost all the clusters appeared to be compensated by having a similar level of expression between sexes at 120 h stage, except for the mosaic cluster (cluster 7, 125 genes). In the head samples, almost all of the Z-linked genes were found to be dosage-compensated, except for a very few local effects.
Finally, to evaluate the profound male-biased expression at 96 h stage from the embryonic RNAseq data, we selected and analysed five Z-linked and five autosomal gene expressions from different chromosomal locations through quantitative reverse-transcription PCR (qRT-PCR) using rp49 as an endogenous control. We chose rp49, because it is the best endogenous reference gene, with a least  stability index of 0.083 in B. mori when compared with other endogenous reference genes like E2F, actinA1, actinA3, G3PDH and GAPDH [47]. Autosomal genes of embryonic stages showed an overall equal expression in all three stages (figure 8a). All the five Z-linked genes showed a relatively higher fold expression at least in 96 h stage (figure 8b). For the head samples, the selected four autosomal (figure 8c) and four Z-linked genes (figure 8d) (based on FPKM values) were shown to be almost equally expressing in both sexes in qRT-PCR. The qRT-PCR data of selected genes support the FPKMbased relative expression from RNA-seq data analysis. Furthermore, we have found that the expression level of masc is relatively higher (6.45-fold) in male embryos compared with female embryos (electronic supplementary material, table S2) at 96 h of development, a key gene, having roles in both DC and also in sex determination.

Relatively reduced expression of Z-linked genes compared with that of autosomes
Our study showed that in B. mori, Z : A ratio is less than 1 in both the sexes, representing the reduced expression of Z-linked genes to that of autosomes. A plausible explanation for this unusually reduced Z : A ratio is the low expression of Z-linked genes compared with a major proportion of autosomes (12, which is 48.06% of the genome) (figure 9 and electronic supplementary material, table S3). The case of B. mori is similar to Heliconius species, where the expression of Z-linked genes is substantially reduced to that of autosomes [33]. But, from an evolutionary perspective, there is no currently available hypothesis to explain this reduced Z : A ratio in these species. The mechanisms of DC evolution can provide insights to answer this. It has been proposed that DC has evolved in a two-step process (Ohno's hypothesis) in Caenorhabditis and mammals [10][11][12][13][14], whereas in Drosophila, it is a direct upregulation of X-linked genes by DC complex (DCC) in males (XY) [8,9] (see Introduction). We propose that the basic element of driving force for the evolution of such diverse mechanisms with a single objective of achieving DC should be  For instance, in Drosophila, Caenorhabditis and mammals, the value of ancestral expression of proto-X might be fixed as 1 [10][11][12][13][14]. So, in males (XY), the expression level of X is 1 and in females (XX) it is 2. Hence, in these species, the DC force would have evolved by choosing one of the possible/suitable paths like 'DCC-mediated X upregulation' or 'two-step process' (Ohno's hypothesis) to attain the destined expression level of 2 for X chromosome(s) in males XY = 2 and females XX = 2, which matches the value of autosomes (AA = 2). If the ancestral expression of proto-X/Z gets fixed as approximately 0.6 or so in a species, then the DC evolutionary pressures may choose a different path like, 'repression of ZZ expression in homogametic sex' for achieving DC. For example, in the case of B. mori, the Z : A ratios were approximately 0.6 in both sexes (table 1), but the M : F ratios for both A-and Z-linked genes are approximately 1, representing the equally expressing A-and Z-linked genes in both sexes (electronic supplementary material, table S4). Based on these observations, we speculate that the DC evolutionary pressures in B. mori would have selected for the repression of Z in homogametic sex, in order to achieve the DC faster. Thus, in short, the 'ancestral expression of proto-sex chromosomes' is one of the crucial determinants of the X : A or Z : A ratio, which in turn drives the DC evolutionary pressures to choose the path of DC mechanism (ancestral expression of proto-sex chromosomes > X : A or Z : A ratio > DC path).

Sex-biased expression of Z at early embryonic stages
The M : F ratio distributions imply the sex-biased expression of genes for the samples. In our study, these ratios suggest a clear male-biased expression of Z-linked genes at 96 h and a female-biased expression at 120 h (  at 120 h probably indicates the onset of DC effect probably initiated after 96 h and executed at the later stage, 120 h of development. A significant male-biased expression of A-and Z-linked genes being observed at the early stages of 78 and 96 h gets normalized at a later stage of 120 h, suggesting the advent of DC mechanism. This scenario clearly indicates the primary objective of DC mechanism to equalize the expression differences of A-and Z-linked genes between sexes [20]. In head samples, Z-linked genes showed an almost unbiased expression suggesting its compensation. By contrast, a slightly male-biased expression (statistically significant, table 3) of Z at 78 h stage may be initial, and we presume that this stage might not be representing a relative full-fledged Z-linked gene expression, based on comparatively lower Z : A ratios at 78 h stage (figure 2 and table 2). However, from the heatmap, it is evident that a large number of genes with higher expressions showed a male-biased expression at 78 and 96 h stages. These genes shift from a male-biased to -unbiased expression (based on similar colour schema observed between sexes) at the later stage of 120 h. This indicates the advent of dosage compensatory effect over these genes at this stage and also suggests that the onset of DC can be considered to initiate post 96 h. In the head samples, except for a very small subset of genes, almost all the genes showed a comparable level of expression, indicating the established DC. Although there is only evidence of Z suppression in males [43], the clue obtained from the observation of a relatively increased expression of genes with higher expression (figure 7, cluster 9) in females, compared with male embryos at 120 h stage, suggests the existence of Z hyperactivation in females. At 78 and 96 h stages, the genes with lower expression (figure 7, cluster 2) were female-biased and turned out to be male-biased at the later stage of 120 h stage on an overall basis, suggesting an inverse effect of DC mechanism over this subset (lower expression, figure 7, cluster 2) of genes.
From this study, we have identified an embryonic stage (120 h), at which the effect of DC comes into action (figure 2). Gene-wise comparisons at 96 h showed a profound male-biased Z-linked gene expression, which gets counter-attacked by DC at 120 h, exhibiting a slightly female-biased Z-linked gene expression at this stage. The male-biased expression of Z-linked genes in the early stages of development (78 and 96 h) could be due to the functionally inactive, putative DCC whose initiation (post 96 h) and advent from the later stage (120 h) would have established the DC in B. mori tissues. We suggest 96 h as a crucial developmental stage both for DC, based on relatively higher expression of masc, a dosage   regulator gene in males (electronic supplementary material, table S2) and also in sex determination, due to the initiation of dsx sex-specific differential splicing (Ajimura M. et al. 2017, unpublished data; [48]). From our results, the complete DC is apparent in B. mori, depicting its essentiality for sexual fitness in this species. The complete DC is represented by an overall Z-linked gene expression parity between the sexes with a relatively reduced expression compared with autosomes, a unique trend generally not seen in the dosage-compensated taxa [12]. This type of reduced expression of ZZ in B. mori males is analogous to the downregulation of both XX transcription in hermaphrodite C. elegans, probably by increased chromosome condensation [49]. The speculated DC mechanism of C. elegans promoted by XO lethal-1 (xol-1) is similar to that of the masc [43] in the B. mori; both the genes regulate the sex determination and the DC. Both the dosage mechanisms result in the hypoexpression of the XX/ZZ chromosomes.

Dosage compensation occurs through reduced expression or hyperexpression of Z chromosome?
The emerging evidence suggests that the patterns of DC are highly variable across sex-determination system and species [14]. The DC observed in B. mori stands as a unique mechanism, which is achieved mostly through the hypoexpression of the ZZ chromosomes in males. In a few ZW species studied, male ZZ : AA expression ratios in general were reported to be approximately 1, e.g. in Aves it is 1.01 [27,28], in nematode Schistosoma mansoni it is 1.06 [30] and in lepidopteran P. interpunctella it is 0.95 [32]. Whereas, in females ZW/AA ratios have been reported to be approximately 0.5 (table 4), suggesting an incomplete DC in these species. In Heliconius species, the DC is reported to be imperfect or incomplete based on a consistent male-biased expression of Z-linked genes in various tissues tested for dosage [33]. By contrast, these ratios were almost equal between the sexes and have a value of less than 1 in the lepidopterans M. sexta (ZZ : AA = 0.83 and ZW/AA = 0.81) [31] and B. mori [ZZ : AA = approximately 0.6 (mean) and ZW/AA = approximately 0.6 (mean)], suggesting the complete DC (table 4). The median Z : A ratios (figure 2) suggest that in male embryos, the Z-linked gene expression reaches the relative expression point of approximately 0.5 first compared with females (observed first at 96 h stage). But at a later stage of development (120 h), the Z-linked gene expression was found to be slightly higher in females compared with males (figure 2), contrasting the double dose (copy number) of Z chromosomes in males. This can be explained by these possibilities: (i) the hyperexpression of single Z chromosome in females, (ii) by the hypoexpression of ZZ in males, or (iii) by a combinatorial effect of both hyper-and hypoexpressions of Z-linked genes.
A recent report suggests the involvement of a Z-linked CCCH type zinc finger gene, masc in the DC, as the RNAi of masc resulted in upregulation of Z-linked genes [43]. In this study, we observed a masc gene expression level-driven suppression of Z-linked genes in males (ZZ). The overall Z-linked gene expression difference between sexes at 96 h was found to be significantly different (figure 1). The masc gene at 96 h stage was found to be male-biased (electronic supplementary material, table S2). Being hypothesized to be involved in the mechanism of DC, the increase in the fold change of masc at 96 h suggests its crucial involvement in the hypoexpression of ZZ in males to that of the Z in females [43]. The male-biased expression of masc, especially at the dosage uncompensated stage of 96 h, denotes its probable involvement in the mechanism of DC, and this male-biased masc expression would have led to the hypoexpression of ZZ at the immediate next embryonic stage of 120 h where the Z dosage was compensated ( figure 6). As the masc gene was also shown to be involved in the mechanism of DC by suppression of ZZ in males, its male-biased expression at 96 h stage of development (electronic supplementary material, table S2) probably suggests its crucial involvement in DC [43]. The overall Z-linked gene expression is slightly female-biased at 120 h stage (figure 7), this may be due to 'dosage over-compensation effect', an exact opposing phenomenon to 'inverse dosage effect', observed in D. melanogaster in which there is a upregulation of single X chromosome in males due to loss of repressors [50,51]. We presume this effect as temporary, mediated by the gain of ZZ repression in males due to a tight transcriptional downregulation of the freely expressing ZZ chromosomes in males [43]. Besides, this effect may also be treated as over-representation of single female Z chromosome expression in the background of ZZ repression in males. Because of this effect, we presume that the freely expressing female Z chromosome at 120 h stage seems to be apparently higher than its male counterpart. The M : F ratio distributions (figure 5) of A and Z clearly indicate the initial absence (96 h) and probable establishment of DC in the later stage (120 h) of embryo development. Based on this finding, we speculate that the reason for lower Z : A ratio observed in 120 h male could be due to the suppressed expression of Z chromosome at this stage; high level of masc expression at 96 h may be suppressing ZZ expression in males, thereby bringing down the Z expression at 120 h stage (table 2). All these findings suggest and support the hypothesis of probable suppression of Z-linked genes in males [43].
Furthermore, there could be two possibilities of suppression of Z-linked gene expression in males: (i) by suppression of both the Z-linked loci in males; (ii) by inactivation of one of the Z chromosomes similar to the hetero-chromatinization of one of the X chromosomes by bar body formation in mammals [20]. We strongly believe in the possibility of suppression of both the Z-linked loci in males based on indirect evidence that no Barr bodies were identified in B. mori histology.
Additionally, the lepidopterans studied for DC showed mixed patterns of complete (M. sexta and B. mori), incomplete (P. interpunctella) and imperfect (Heliconius) types of DC mechanisms. Generally, it is believed that the Z chromosome synteny (scaffold or contig order) is conserved among lepidopterans [52,53]. However, several distinctly unique micro-syntenies (gene constitution) or sex chromosomal fusions may be present in various species. For example, in a few lepidopterans like Samia cynthia pryeri, the Z chromosome is fused with the 12th and 13th chromosomes suggesting the existence of neo-chromosome of Z chromosomes [54], and in Bombyx the Z chromosome is enriched with testis-specific genes [34]. Similarly, several unknown differences may be naturally occurring among the lepidopteran Z chromosomes. These structural variations in sex chromosomes with respect to gene constitution might ultimately influence the DC evolutionary forces to choose different paths, thus yielding differential patterns of DC in lepidopterans. It is interesting to note that in Drosophila, a male heterogametic species, DC is achieved by the hypertranscription of single X chromosome in males [8], and in Bombyx, a female heterogametic species, the DC is achieved by the hypo/reduced transcription of two ZZ chromosomes in males [43]. This suggests that the DC evolutionary force has chosen the counteracting/opposite mechanisms in the male and female heterogametic species for achieving the DC.
Altogether the growing data and analyses, especially using RNA-seq, in various species present a dynamic picture of patterns of DC, suggesting the initial selection of highly diverse mechanisms being adapted by the evolutionary forces with a focused objective of achieving the DC. A re-evaluation of DC in mammals and C. elegans using RNA-seq data contradicts Ohno's hypothesis, questioning our current knowledge of the sex chromosome evolution and DC mechanisms [55]. Hence, by taking the advanced and accurate RNA-seq data into consideration, there is a necessity of critical revision of current evolutionary theories on DC. A new theory should emerge in order to explain the reasons for lower expression of X or Z to that of autosomes to gain a comprehensive understanding of the sex chromosome evolution and DC mechanisms.

Conclusion
In this study, we have compared dosage of Z-linked genes in different embryonic stages between male and female sexes and showed that Z-linked gene expression dosage was not compensated in early embryonic stages. As the embryo ages and after the upregulation of masc gene, Z-linked genes in males show a lower expression, compensating with the dosage of females. We speculate that DC emerges after 96 h in male silkworm. In the embryo samples, 96 h is considered as a crucial developmental stage, at which the sex determination and differentiation are more finely tuned in the embryos. To our knowledge, this is the first report showing the initiation of DC in embryonic stages using RNA-seq data. We also show complete DC in the larval stage of B. mori by comparing male and female transcriptomes of the head tissue. In B. mori, the type of complete DC observed is ZZ : AA = Z : AA < 1, which is very distinct to that of XX/AA = X/AA = 1. Further studies have to be conducted to confirm whether this compensation is through silencing of one of two Z chromosomes in males or through the reduced expression of genes in both the copies of Z chromosome.

Sample collection, library preparation and RNA-sequencing
Two W-chromosome mutant strains of B. mori were used: (i) Japanese sex limited (JPSL) for the sexed embryo collection and (ii) sex-limited strain (QGSLO) for the sexed larval heads. In JPSL, the translocation of chromosome 10 fragment harbouring the kynurenine monooxygenase gene on to the Wchromosome is believed to be responsible for the development of dark brownish serosal pigmentation, which acts as a visible marker to differentiate female embryos as early as 36 h. In the QGSLO strain, female larvae can be easily distinguished by distinct crescent-shaped markings on the dorsal side of larva from the fourth-instar stage. For the embryo collection, JPSL strain moths were set for 4 h mating at room temperature, transferred to 4°C overnight, followed by depairing and were set in dark at room temperature for 2 h for a uniform egg laying. Collected eggs were cold acid-treated (one of the methods of breaking diapause) at an age of 20 h to break the diapause and were thoroughly washed under running tap water and incubated at 25°C for the development to proceed. At 36 h, male and female eggs were segregated based on the serosal pigmentation. Sampling of 200 each of male and female embryos was done at 78, 96 and 120 h. For the head tissue collection, larvae of (QGSLO) fifth-instar 5th day were numbed on ice for 30 min, decapitated, 10 male and 10 female larval heads were pooled and snap-frozen in liquid nitrogen until use.
For collecting BmN cells, log-phase cells that are regularly passaged in TC-100 (Sigma) insect media with 10% FBS (Gibco) were selected, slogged, pelleted in PBS and stored at −80°C until use. From the collected samples, total RNA was isolated and on-column DNAseI treated for the removal of genomic DNA contamination using the Direct-Zol RNA isolation kit (Zymo Research). RNA libraries were prepared following the TruSeq RNA sample preparation kit v2 (catalogue no. RS-122-2001) from Illumina using 1 µg of mRNA. Sequencing was performed on an Illumina 1000 HiSeq platform (C-CAMP, Bangalore). In total, not less than approximately 60 million pair-end reads of 100 bp, for each of the sexed embryonic stages and head samples, were generated.

RNA-sequencing read quality filtration, mapping and data analyses
The RNA-seq read quality was assessed using the package FastQC [56]. The adapter removal and quality trimming was performed using Trimmomatic v. 0.35 [57]. The leading and trailing low quality or N bases were removed below quality 3, reads were scanned with a 4-base sliding window and clipped when the average quality per base dropped below 20, and read sequences below 30 bases in length were dropped. The filtered paired-end reads were mapped against the B. mori genome sequence and its annotations (downloaded from Ensembl release 29 (2015) (GCA_000151625.1.29) [58]) using Bowtie 2 v. 2.2.6 [59,60] with default parameters. The aligned reads were filtered to keep only uniquely mapped reads. SAM/BAM conversions, sorting, indexing and filtering were performed with SAMtools v. 1.2. [61]. The alignment files (SAM format) so obtained were imported into Seqmonk software [62]. While importing the SAM files into Seqmonk, the libraries were treated as pair-end and duplicate reads were eliminated. The log 2 -transformed FPKMs for genes were quantitated, by correcting for DNA contamination, transcript isoforms were merged, transcript length correction was made, besides excluding the genes with no or very low read counts (considered as noise in Seqmonk) to avoid bias in the data that might skew the analysis. The raw read counts table and quantitated genes report (FPKM values) of Seqmonk analysis were exported to Excel (Microsoft) and further analysed. Moreover, for B. mori, as the GFF file from Ensembl was not annotated with the chromosome number, which is required for current analysis. So, to overcome this, the 'description' code of each gene in the GFF file was replaced with its corresponding chromosome number, based on the scaffold identity from KAIKObase [63] annotation data using custom shell scripts.

Statistics for Z dosage and data representation
Based on the scaffold mapping, the genes were grouped into autosomal (A)-and Z-linked (Z) genes. The unmapped genes were excluded from further analysis. Similar to a conventional dosage analysis, the dosage in B. mori was tested by two estimates. They are (i) Z : A ratios (autosomal relative expression of Z-linked genes) and (ii) M : F ratios (sex-biased expression of A and Z-linked gene expressions). To assess the Z dosage effects in the samples, the ratio of autosomal and Z-linked gene expression (Z : A) was calculated within each sample (male and female), and this ratio was compared between the sexes. The true expression (FPKM = 0) of all the genes were estimated by choosing the option, 'Don't quantitate probes with no counts' (probes = mRNA in our analysis) in Seqmonk during probe quantitation. The Z : A ratios were estimated using this 'true expression dataset' (FPKM = 0) and resulted in mapping of approximately 10 000 genes for embryo samples and approximately 7000 genes for head samples. The mean and median Z : A ratios were estimated, which represent the relative expression level of Zlinked genes compared with that of autosomal. Bootstrapping (10 K) of log-transformed FPKM data was performed to find the 95% confidence interval for the Z : A point estimate of the median, using the online web tool STATKEY [64]. We compared the median level expression differences between autosomal and Z-linked genes within the sex for each sample individually by non-parametric MWU test (Wilcoxon rank-sum test), conducted in R package [65]. Furthermore, the M : F ratio distributions were calculated from (i) FPKM values (electronic supplementary material, file S1) and (ii) raw counts data (electronic supplementary material, files S2-S5) of genes that showed expression (FPKM = 0) in both sexes. Furthermore, the genes with less than four raw read counts in at least one of the samples out of four (comparison: 2 male technical replicates × 2 female technical replicates) were removed and the remaining genes were subjected to the trimmed mean of M-values (TMM) normalization method (electronic supplementary material, file S6 and for script electronic supplementary material, file S7) for the estimation of M : F ratio distributions [66]. The log 2 (M : F) density distributions were generated using Wessa.net histogram, online web tool [67] to reveal the overall picture of the sexbiased distribution of the autosomal and Z-linked genes in the samples. The median level differences between the M : F distributions for autosomal and Z-linked genes were also tested by the MWU test. To compare and explore any discrepancies found between male and female median Z-linked genes at various levels of gene expressions, an unpaired comparison for the Z-linked gene expression data between sexes was conducted. For this, the Z-linked log 2 FPMK distribution data for both the sexes were sorted independently in a descending order and were divided into quartiles (Q1-Q4), each representing different magnitudes of gene expression (high-Q4, medium-Q3, low-Q2 and very low-Q1 expressing genes) (figure 3; for script, electronic supplementary material, file S7). For the quartile-based max of male or female data analysis, the Z-linked genes were segregated as paired data and split into quartiles (figure 6; for script, electronic supplementary material, file S7). The median expression difference within quartiles between sexes was tested by the MWU test. Additionally, to view the profile of Z-linked gene expression, all the Z-linked genes were filtered and saved as annotation track from which a heatmap was generated based on clustering. For generating heatmap in Seqmonk, even the loci with no true expressions were also quantitated.

cDNA preparation
The RNA concentration of all the samples was measured using Nanodrop 2000 (Thermo Scientific). cDNA was synthesized from 1 µg of total RNA using SuperScript™ III (Invitrogen), following the manufacturer's instructions. Briefly, 1 µg of total RNA, 1 µl of 10 mM dNTPs, 1 µl of 12 nucleotide oligodT primer and dH 2 O to 13 µl were added in a tube and incubated at 65°C for 5 min, followed by freezing the mixture on ice for 1 min. To this mixture, 1 µl of 0.1 M dithiothreitol, 4 µl of 5× buffer, 1 µl of SuperScript enzyme and 1 µl of RNaseOut were added, mixed with pipette and incubated at 50°C for 1 h followed by stopping the reaction at 75°C for 15 min.

Quantitative RT-PCR
The relative expression of selected genes (see electronic supplementary material, table S5 for primer sequences) was validated through qRT-PCR (ABI 7500). The reaction was set using SYBR Premix Ex Taq (Tli RNaseH Plus) from Takara Bio, Inc. The reaction mixture included cDNA sample of 3 µl (diluted to 10 ng µl −1 ), 0.2 µM primers in a final volume of 20 µl of master mix. Reaction conditions were: 95°C for 30 s, 95°C for 5 s and 60°C for 34 s. The standard curve analysis was done using ABI SDS software, v. 1.2.3. The reactions were carried out in triplicates and the relative expression was determined using C t analysis. Fold change values for male samples relative to female samples (calibrator) were obtained by normalizing the gene expression values to the rp49 as endogenous/internal reference control separately [47].
Data accessibility. All sequence data are available through BioProject ID: PRJNA388026.