Anti-cancer and antimicrobial potential of five soil Streptomycetes: a metabolomics-based study

The lack of new anti-cancer and anti-infective agents has redirected pharmaceutical research towards natural product discovery, especially from actinomycetes, which are one of the major sources of bioactive compounds. A metabolomics- and dereplication-guided approach has been used successfully for the chemical profiling of bioactive actinomycetes. We aimed to study the metabolomic profiles of five bioactive actinomycetes to investigate the metabolites responsible for their antimicrobial and anti-cancer activities. Three actinomycetes, namely Streptomyces sp. SH8, SH10 and SH13, exhibited a broad spectrum of antimicrobial activity, whereas isolate SH4 showed the broadest antimicrobial activity, inhibiting all tested strains. In addition, isolates SH8, SH10 and SH12 displayed potent cytotoxicity against the breast cancer cell line Michigan Cancer Foundation-7 (MCF-7), whereas isolates SH4 and SH12 exhibited potent anti-cancer activity against the hepatoma cell line hepatoma G2 (HepG2), compared with their weak inhibitory effects on the normal breast cells MCF-10A and the normal liver cells transformed human liver epithelial-2 (THLE2), respectively. All bioactive isolates were molecularly identified as Streptomyces sp. via 16S rRNA gene sequencing. Our actinobacterial dereplication analysis putatively identified several bioactive metabolites, including tetracycline, oxytetracycline and a macrolide antibiotic, novamethymycin. Together, chemical profiling of bioactive Streptomycetes via dereplication and metabolomics helped in assigning their unique metabolites and predicting the bioactive compounds underlying their diverse bioactivities.

Introduction
Natural products from various sources have contributed significantly to drug discovery and development, especially for the treatment of cancers and infections [1,2]. More than 40% of the drugs approved as anti-infective or anti-cancer agents between 1981 and 2014 were based on natural products or their derivatives [2]. However, the chemical complexity of bioactive secondary metabolites, together with supply problems from their biological sources, slowed pharmaceutical research on natural product discovery in comparison with synthetic drugs [1]. Meanwhile, reports of antibiotic-resistant microorganisms in both community and clinical settings are growing, with increasing mortality rates. The rising number of multi-drug-resistant pathogens is the main challenge facing researchers seeking new antimicrobial agents that are effective against these organisms [3]. Therefore, there is increasing interest in discovering new natural products to combat antibiotic resistance [4].
Metabolomics is one of the most popular and wide-ranging applications of bioinformatics, which helps understand the primary and secondary metabolism of microorganisms, plants and animals [11]. In particular, metabolomics studies all the secondary metabolites produced by a biological system [12]. In addition, dereplication is one of the most widely used tools for distinguishing new compounds from known ones by quickly identifying secondary metabolites in microbial extracts against those already reported in databases such as the Dictionary of Natural Products (DNP) and AntiMarin. Therefore, dereplication studies, together with multivariate data analysis (MVA), are commonly used in drug discovery programmes [11,13].
Several previous studies have employed a bioassay-guided approach for the discovery of microbial natural products in Egypt [4,14], whereas only a few studies have used a bioassay- and metabolomics-guided approach for the chemical profiling of bioactive actinomycetes and the isolation of new biologically active secondary metabolites from them [15]. In the current study, we aimed to chemically profile soil actinomycetes with broad antimicrobial and anti-cancer activities using a dereplication- and metabolomics-guided approach, to investigate the secondary metabolites responsible for the isolates' bioactivities and to assign the promising metabolites that could be targeted for isolation.

Isolation of actinomycetes
Five actinomycetes were recovered from soil samples collected during the winter of 2016 from the Sherif-Pasha village in Beni-Suef Governorate, Egypt. The soil samples were collected from the top layer of agricultural soil, placed in sterile plastic bags and transported to the laboratory for isolation. The actinobacterial isolates were obtained using a slightly modified soil serial dilution approach [16,17]. In brief, 1 g of each soil sample was suspended in 9 ml of 0.9% saline and then serially diluted with saline up to 1 × 10⁻⁶. Then, all tubes were mixed well using a rotary shaker at 200 r.p.m. for 15 min. Subsequently, 1 ml from each dilution tube was spread on starch casein agar containing nystatin (50 µg ml⁻¹) to prevent fungal contamination and rifampicin (5 µg ml⁻¹) to prevent bacterial contamination. All inoculated plates were then incubated for 7 days at 30°C. Actinomycete colonies were characterized according to their morphology and pigmentation and selected for further purification. Finally, pure colonies of actinomycetes were sub-cultured on starch casein agar and incubated at 30°C for 7 days, followed by storage at −80°C in 30% glycerol broth.
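The tenfold dilution scheme described above can be sketched in a few lines of Python. This is only an illustration: it assumes that 1 g of soil in 9 ml of saline counts as the 10⁻¹ dilution, and the colony count and the helper names (`tenfold_dilutions`, `cfu_per_gram`) are hypothetical, since no colony counts are reported in this study.

```python
def tenfold_dilutions(last_exponent=6):
    """Dilution factors 10^-1 ... 10^-last_exponent of a tenfold series."""
    return [10.0 ** -k for k in range(1, last_exponent + 1)]


def cfu_per_gram(colony_count, dilution, plated_volume_ml=1.0):
    """Back-calculate colony-forming units per gram of soil.

    Assumption: 1 g of soil suspended in 9 ml of saline is the 10^-1 dilution.
    """
    return colony_count / (dilution * plated_volume_ml)


dilutions = tenfold_dilutions()

# Hypothetical count: 42 colonies on the plate spread from the 10^-4 tube.
estimate = cfu_per_gram(42, dilutions[3])
```

With these illustrative numbers the estimate is on the order of 4 × 10⁵ CFU per gram; the value only demonstrates the arithmetic.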

Antimicrobial activity screening
The antimicrobial activity of the actinobacterial isolates was tested against a panel of indicator strains, including the Gram-positive bacteria (Staphylococcus aureus (ATCC 43300), Listeria monocytogenes (ATCC 7644) and Bacillus subtilis (environmental sample)) and the Gram-negative bacteria (Escherichia coli (clinical isolate) and Salmonella enterica (ATCC 14028)), in addition to yeast (Candida albicans (ATCC 60193)), as previously described [18]. Briefly, each actinomycete was cultivated on International Streptomyces Project 2 (ISP2) agar at 30°C for 7 days. Then, 7 mm agar discs were cut from the growth of each isolate and transferred to the surface of tryptone soya agar plates that had already been inoculated with the indicator strains using sterile swabs. After pre-diffusion for 90 min at 4°C, the plates were incubated for 24 h before the antimicrobial activity results were read. Finally, an isolate was recorded as positive for antimicrobial activity when it produced an inhibition zone greater than or equal to 10 mm.
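The scoring rule above (an inhibition zone of at least 10 mm counts as positive) is easy to express as code. The sketch below uses made-up zone diameters for a single isolate; the function name `active_against` and the example values are illustrative and are not measurements from this study.

```python
ZONE_THRESHOLD_MM = 10.0  # cutoff stated in the text


def active_against(zones_mm, threshold=ZONE_THRESHOLD_MM):
    """Return the indicator strains inhibited with a zone >= threshold (mm)."""
    return sorted(strain for strain, zone in zones_mm.items() if zone >= threshold)


# Hypothetical zone diameters (mm) for one isolate; not measured values.
example_zones = {
    "Staphylococcus aureus": 14.0,
    "Escherichia coli": 9.0,
    "Candida albicans": 10.0,
}
hits = active_against(example_zones)
```

Note that the comparison is inclusive, so a zone of exactly 10 mm is scored as positive, matching the "greater than or equal to" wording.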

Molecular characterization of bioactive actinomycetes
Five bioactive actinomycetes were identified by 16S rRNA gene sequencing. A pure colony from each isolate was inoculated in ISP2 broth and incubated for 5 days at 30°C, after which 1 ml of each broth was transferred into a sterile Eppendorf tube and centrifuged using a benchtop micro-centrifuge at 12 000 r.p.m. for 30 min. After the removal of the supernatant, the pellets were then used for DNA extraction. A GeneJET Plant Genomic DNA Purification Mini Kit was used to harvest genomic DNA according to the manufacturer's protocol. Finally, DNA concentration and purity were measured using a Nanodrop 2000.
The 16S rRNA genes in actinobacterial genomic DNA were amplified by polymerase chain reaction (PCR) using the universal primers 27F (5′-AGAGTTTGATCCTGGCTCAG-3′) and 1492R (5′-GGTTACCTTGTTACGACTT-3′) [19]. Each 50 µl PCR reaction contained 25 µl of 2× EF-Taq DNA polymerase (SolGent, Korea), 1 µl of each primer and 5 µl of DNA extract, made up to a final volume of 50 µl with sterile distilled water. PCR amplification was performed using a previously described PCR programme [20]. Gel electrophoresis was employed to confirm successful amplification by running the PCR products on a 1% (w/v) agarose gel with a 1 kb reference ladder. The PCR products were then purified using the SolGent PCR purification kit (Daejeon, Korea) [21]. Finally, the PCR products were sent to Macrogen, Korea, for sequencing using an ABI 3730XL DNA Analyzer with BigDye Terminator v. 3.1 Cycle Sequencing Kits (Thermo Fisher Scientific, USA).
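As a quick check of the reaction set-up above, the volume of water needed per 50 µl reaction follows directly from the listed components. The sketch below also derives the reverse complement of 1492R, that is, the sense-strand sequence the reverse primer anneals opposite to. The function names are illustrative and not part of the published protocol.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

PRIMER_27F = "AGAGTTTGATCCTGGCTCAG"
PRIMER_1492R = "GGTTACCTTGTTACGACTT"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def water_volume_ul(total_ul=50.0, polymerase_mix_ul=25.0,
                    primer_ul=1.0, template_ul=5.0, n_primers=2):
    """Water (µl) needed to bring one reaction to its final volume."""
    water = total_ul - polymerase_mix_ul - n_primers * primer_ul - template_ul
    if water < 0:
        raise ValueError("components exceed the total reaction volume")
    return water


water = water_volume_ul()                       # 50 - 25 - 2 - 5
site_1492r = reverse_complement(PRIMER_1492R)   # sense-strand target of 1492R
```

With the volumes given in the text, 18 µl of sterile distilled water completes each reaction.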
The MegaBLAST tool of the National Center for Biotechnology Information (NCBI) was used to compare the 16S rRNA gene sequences of our isolates with those in the GenBank database. Then, 20 sequences that were highly similar to our amplified 16S rRNA genes were subjected to multiple sequence alignment with our sequences, followed by the generation of a phylogenetic tree using the neighbour-joining method [22] with the Mega-X software. Finally, our bioactive actinomycete 16S rRNA gene sequences were deposited in GenBank with the accession numbers MZ027566, MZ027590, MZ027603, MZ027606 and MZ027616 for isolates SH4, SH8, SH10, SH12 and SH13, respectively.

Fermentation and extraction of secondary metabolites
The bioactive isolates were cultured from glycerol stocks onto starch casein agar and incubated for 7 days at 30°C. Then, a pure colony from each plate was inoculated into a 500 ml Erlenmeyer flask containing 125 ml of ISP2 broth and incubated for 7 days at 30°C in a shaker incubator at 160 r.p.m. An equal volume (1 : 1 v/v) of high-performance liquid chromatography (HPLC)-grade ethyl acetate was subsequently added to the broth, mixed and left overnight before being filtered through a 100 mm filter paper. Next, liquid-liquid partitioning between ethyl acetate and broth was performed twice to extract the natural products. The ethyl acetate was then evaporated using a rotary evaporator. Finally, all residues were weighed to determine their dry weight and preserved for future work as previously described [23].

Cytotoxicity assay
The crude extracts were tested for cytotoxicity against the cancer cell lines MCF-7 and HepG2 and the corresponding normal cell lines MCF-10A and THLE2. Briefly, Dulbecco's modified Eagle medium (DMEM) containing 10% fetal bovine serum, 10 µg ml⁻¹ insulin and 1% penicillin-streptomycin was used to maintain all cell lines in 96-well plates at 37°C and 5% CO₂ until they reached a cell density of 1.2-1.8 × 10⁴ cells/well. Both cancer and normal cells were treated with aqueous solutions of the extracts and of the standard anti-cancer agent at serial concentrations of 100, 25, 6.3, 1.6 and 0.4 µg ml⁻¹. Then, the 96-well plates were incubated at 37°C and 5% CO₂ for 48 h before examination under an inverted microscope. Untreated cells were used as a control for the growth of the cell lines, while culture medium (DMEM) was used as a blank. The cytotoxic activity of the extracts and the standard drug was evaluated using the MTT assay. The treated and untreated cell lines were incubated with MTT for 2-4 h (depending on cell type and maximum cell density) at 37°C in a CO₂ incubator. Then, the culture media were aspirated, and the formazan product was solubilized using MTT solubilizing solution, M-8910 (10% Triton X-100 and 0.1 N HCl in anhydrous isopropanol). Finally, MTT absorbance was quantified using a microtitre plate reader at 570 nm, and the multi-well plate background absorbance, recorded spectrophotometrically at 690 nm, was subtracted from the 570 nm reading [25]. For each tumour cell line, the relationship between viable cells and extract concentration was plotted as a survival curve, and the IC₅₀, the concentration of actinomycete extract causing a 50% reduction in absorbance compared with the control value, was calculated. The assay was performed in triplicate, with 5-fluorouracil as the standard anti-tumour agent.
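The text does not state how the IC₅₀ was read off the survival curve, so the following is only one common way to do it: subtract the 690 nm background from the 570 nm reading, express each well as a percentage of the untreated control, and interpolate linearly on log₁₀ concentration between the two doses that bracket 50% viability. The viability values are hypothetical, and all function names are illustrative.

```python
import math


def corrected_absorbance(a570, a690):
    """Background-corrected MTT signal (570 nm reading minus 690 nm background)."""
    return a570 - a690


def percent_viability(treated, control, blank=0.0):
    """Viability of treated wells as a percentage of the untreated control."""
    return 100.0 * (treated - blank) / (control - blank)


def ic50_log_interpolation(concentrations, viabilities):
    """Interpolate the concentration giving 50% viability on a log10 scale.

    Returns None when no pair of adjacent doses brackets 50%.
    """
    pairs = sorted(zip(concentrations, viabilities))
    for (c_lo, v_lo), (c_hi, v_hi) in zip(pairs, pairs[1:]):
        if v_lo != v_hi and (v_lo - 50.0) * (v_hi - 50.0) <= 0:
            fraction = (v_lo - 50.0) / (v_lo - v_hi)
            log_ic50 = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_ic50
    return None


# Dose series from the text (µg/ml) with hypothetical viabilities (%).
doses = [0.4, 1.6, 6.3, 25.0, 100.0]
viability = [95.0, 80.0, 40.0, 15.0, 5.0]
ic50 = ic50_log_interpolation(doses, viability)
```

A fitted four-parameter logistic curve is a frequent alternative; interpolation is shown here only because it needs no external libraries.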

Liquid chromatography-high-resolution mass spectrometry
Secondary metabolites in the bioactive crude extracts were chromatographically separated using high-performance liquid chromatography (Accela HPLC, Thermo Fisher Scientific) in conjunction with an Accela UV detector and an Exactive (Orbitrap) mass spectrometer (Thermo Fisher Scientific) to determine their exact mass. The HPLC run was performed using an rpHPLC column (BEH C18, 2.1 × 100 mm, 1.7 µm particle size; Waters, USA) protected by a guard column (2.1 × 5 mm, 1.7 µm particle size). The mobile phase consisted of HPLC-grade water containing 0.1% formic acid (v/v) (A) and HPLC-grade acetonitrile (B) with a flow rate of 300 µl min⁻¹. The solvent gradient was adjusted to increase from 10% B to 100% B over 30 min and then maintained for 5 min before decreasing back to 10% B for final washing. The column temperature was set to 40°C and the mass detector to both positive and negative ionization modes. In addition, the capillary temperature was set to 270°C with a spray voltage of 4.5 kV, capillary voltage of 35 V and tube lens voltage of 110 V.
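The gradient programme described above can be written as a small function of run time. The text does not say how quickly the system returns to 10% B, so the sketch assumes an immediate step back after the 5 min hold; `percent_b` is an illustrative name.

```python
def percent_b(time_min):
    """Acetonitrile fraction (% B) at a given time (min) in the run."""
    if time_min < 0:
        raise ValueError("time must be non-negative")
    if time_min <= 30:
        return 10.0 + 90.0 * time_min / 30.0  # linear ramp from 10% to 100% B
    if time_min <= 35:
        return 100.0                          # 5 min hold at 100% B
    return 10.0                               # assumed step back to 10% B


midpoint = percent_b(15)
```

At 15 min, for example, the mobile phase is halfway up the ramp.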

Dereplication of liquid chromatography-high-resolution mass spectrometry data
The raw mass data were split into negative and positive datasets using the MSConvert tool from ProteoWizard [26] and imported into the MZmine software in the mzML format [27], in which a multi-step protocol was followed as previously described [15,28]. Then, an in-house Excel macro was written to combine the data files of the negative and positive ionization modes generated by the MZmine software as previously described [15,28]. An in-house macro was also used to dereplicate all mass ion peaks from our bioactive isolates against the natural products in DNP [11,15]. Hits of the known natural products from the database were retrieved using ChemBioFinder version 13 (PerkinElmer Informatics, Cambridge, UK). Then, the actinobacterial hits obtained from DNP were investigated using the free online database of natural products available at http://www.knapsackfamily.com/knapsack_core/top.php.
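The in-house macros are not reproduced in the text, so the following is only a minimal sketch of the core dereplication idea: compute the theoretical m/z of a candidate molecular formula for a given ion type and accept it when the mass error is within a tolerance. The two-entry candidate table stands in for a database such as DNP, the 5 ppm tolerance is an assumption rather than a setting reported in the study, and the atomic masses are rounded literature values. The observed m/z values are those reported in this article for viridenomycin and deoxyvalidamine.

```python
import re

# Approximate monoisotopic masses (u); only the elements needed here.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727647  # mass of a proton (u)


def monoisotopic_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C34H37NO6'."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


def adduct_mz(formula, ion):
    """Theoretical m/z for the [M+H]+ or [M-H]- ion of a formula."""
    mass = monoisotopic_mass(formula)
    if ion == "[M+H]+":
        return mass + PROTON
    if ion == "[M-H]-":
        return mass - PROTON
    raise ValueError("unsupported ion type: " + ion)


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def annotate(observed_mz, ion, candidates, tolerance_ppm=5.0):
    """Return (name, ppm error) for candidates within the tolerance."""
    hits = []
    for name, formula in candidates.items():
        error = ppm_error(observed_mz, adduct_mz(formula, ion))
        if abs(error) <= tolerance_ppm:
            hits.append((name, round(error, 2)))
    return hits


# Hypothetical stand-in for a natural products database.
CANDIDATES = {"Viridenomycin": "C34H37NO6", "Deoxyvalidamine": "C7H15NO3"}

negative_hits = annotate(554.2564, "[M-H]-", CANDIDATES)
positive_hits = annotate(162.1125, "[M+H]+", CANDIDATES)
```

Both reported peaks fall within a few ppm of the theoretical values for their proposed formulae, which is the kind of agreement a putative annotation relies on; a mass match alone does not confirm a structure.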

Metabolomics data analysis of the liquid chromatography-high-resolution mass spectrometry data
The large dataset produced by the metabolomics study required multivariate analysis (MVA) for data interpretation. The liquid chromatography-high-resolution mass spectrometry (LC-HRMS) data of the bioactive isolates were thus analysed and interpreted using MetaboAnalyst 4.0, a web-based statistical analysis program. This investigation used a file comprising a table with sample name, peak list (mass-to-charge ratio; m/z) and peak intensities exported as comma-separated values (csv). The MS peak list and intensity data were uploaded as one Zip file to the MetaboAnalyst 4.0 server (https://www.metaboanalyst.ca). Pareto scaling was employed to normalize the data. Then, univariate statistical analysis was conducted, followed by MVA comprising principal component analysis (PCA), hierarchical cluster analysis (HCA) and sparse partial least-squares discriminant analysis (sPLS-DA). PCA distinguished the chemical profiles of all the samples from each other, whereas sPLS-DA distinguished the metabolites of all samples [29]. A heat map was generated from the LC-HRMS data of all bioactive isolates to show the intensity of significant compounds in the chemical profiles of our crude extracts.
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 211509
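Pareto scaling, the normalization step mentioned above, mean-centres each feature and divides it by the square root of its standard deviation, which reduces the dominance of very intense peaks without flattening them completely as unit-variance scaling would. The sketch below is a plain Python illustration on a hypothetical three-sample, two-feature intensity table; it is not MetaboAnalyst's own code.

```python
import math
import statistics


def pareto_scale(matrix):
    """Pareto-scale a table with samples in rows and features in columns.

    Each column is mean-centred and divided by the square root of its
    sample standard deviation; constant columns are only centred.
    """
    scaled_columns = []
    for column in zip(*matrix):
        mean = statistics.fmean(column)
        sd = statistics.stdev(column)
        divisor = math.sqrt(sd) if sd > 0 else 1.0
        scaled_columns.append([(value - mean) / divisor for value in column])
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical peak intensities: 3 samples x 2 features.
intensities = [
    [100.0, 1.0],
    [400.0, 2.0],
    [700.0, 3.0],
]
scaled = pareto_scale(intensities)
```

In this toy table the intense feature still spans a wider range than the weak one after scaling, but the gap shrinks from 300-fold to roughly 17-fold.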

Statistical analysis
Cytotoxicity results in the current study were statistically analysed using Excel. All IC₅₀ values were calculated and expressed as the mean ± s.d. of the triplicate experiments. The IC₅₀ values of the bacterial extracts and of the standard agent, 5-fluorouracil, against the tested cancer cell lines were compared with those against the normal cell lines using an unpaired Student's t-test to assess statistical significance. Values of p ≤ 0.05 were considered statistically significant.
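The comparison described above can be illustrated with the standard library alone. The sketch computes the mean ± s.d. and the pooled-variance (Student's) t statistic for two triplicates. Because the standard library has no t-distribution, significance is judged against the tabulated two-sided 5% critical value for 4 degrees of freedom (2.776), which applies only to two groups of three. The IC₅₀ triplicates are hypothetical.

```python
import math
import statistics

T_CRIT_DF4 = 2.776  # two-sided critical value at p = 0.05 for 4 degrees of freedom


def mean_sd(values):
    """Mean and sample standard deviation of replicate measurements."""
    return statistics.fmean(values), statistics.stdev(values)


def student_t(sample_a, sample_b):
    """Unpaired Student's t statistic (pooled variance) and degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled * (1.0 / n_a + 1.0 / n_b))
    t_value = (statistics.fmean(sample_a) - statistics.fmean(sample_b)) / standard_error
    return t_value, n_a + n_b - 2


# Hypothetical IC50 triplicates (µg/ml) on a cancer and a normal cell line.
cancer_ic50 = [2.0, 2.2, 2.4]
normal_ic50 = [20.0, 22.0, 24.0]

t_value, dof = student_t(cancer_ic50, normal_ic50)
significant = dof == 4 and abs(t_value) > T_CRIT_DF4
```

For other group sizes the critical value changes, and an exact p-value requires a statistics package or spreadsheet function.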

Antimicrobial activity screening of isolated actinomycetes
Five bioactive isolates (SH4, SH8, SH10, SH12 and SH13) were recovered, cultured as pure colonies and phenotypically identified as actinomycetes according to their colonial morphology, mycelium discoloration and pigment diffusion as well as their microscopic examination. Actinomycetes were previously reported to be prevalent in this soil [30,31]. As can be seen from table 1, four actinomycetes (SH4, SH8, SH10 and SH13) demonstrated a broad spectrum of antimicrobial activity against C. albicans as well as at least one Gram-positive bacterium and one Gram-negative bacterium. It is noteworthy that isolate SH4 exhibited the broadest spectrum of antimicrobial activity, inhibiting all the indicator strains tested. As mentioned in many previous reports [14,15,32,33], the soil is considered a rich source of actinomycetes with antimicrobial activities.

Molecular identification of bioactive actinomycetes
The homology search using the MegaBLAST tool in NCBI for the 16S rRNA gene sequences of the five bioactive actinomycetes revealed that all their sequences exhibited high similarity (greater than 99%) with several sequences of Streptomyces species in GenBank. Multiple sequence alignment was performed for our sequences with several 16S rRNA gene sequences from NCBI, and a phylogenetic tree was generated using the Mega-X software. As presented in figure 1, all bioactive isolates were assigned as Streptomyces sp. considering their closeness to many sequences of different Streptomyces, consistent with previous findings that 16S rRNA gene sequencing alone is insufficient for assigning Streptomyces isolates to an exact species [4,14,15].
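The legend of figure 1 states that evolutionary distances were computed with the Kimura 2-parameter method. For readers unfamiliar with it, the distance between two aligned sequences is d = −½ ln[(1 − 2P − Q)√(1 − 2Q)], where P and Q are the proportions of transitions and transversions. The sketch below is an independent illustration, not the MEGA-X implementation; it skips gapped or ambiguous positions pair by pair, and the test sequence is simply the 27F primer with one base changed.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1.0 - 2.0 * p - q) * math.sqrt(1.0 - 2.0 * q))


reference = "AGAGTTTGATCCTGGCTCAG"
variant = "GGAGTTTGATCCTGGCTCAG"   # one A->G transition at the first position
distance = k2p_distance(reference, variant)
```

One transition in 20 compared sites gives a distance slightly above the raw 5% mismatch, because the model corrects for multiple substitutions at the same site.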

Dereplication of HRMS data
Dereplication of the secondary metabolites detected in crude extracts can be conducted using available databases, such as DNP [37], and is widely used as a metabolomics tool in natural product discovery for the isolation of new bioactive compounds [11,15,38] and the investigation of their different bioactivities. Herein, a dereplication study was conducted to explore the anti-cancer and antibacterial capabilities of the Streptomycetes in our study. The HRMS data of the bioactive actinomycetes were dereplicated by comparing the molecular weight and predicted molecular formula (MF) of each detected metabolite with the known compounds in the DNP database and in the online database of natural products available at http://www.knapsackfamily.com/knapsack_core/top.php, to obtain the closest matches. This analysis revealed that most metabolites produced by the Streptomycetes in our study had either no hits or no actinobacterial hits, indicating a high likelihood of isolating novel compounds from their crude extracts. In addition, many other secondary metabolites produced by the bioactive Streptomycetes in our study were putatively identified as natural products previously isolated from other Streptomycetes (table 2 and figure 2). Interestingly, several of these putatively identified metabolites exhibit diverse biological activities, including anti-tumour, antifungal and antibacterial activities, which may explain the antimicrobial and anti-cancer activities of the bioactive Streptomycetes in our study.
Several secondary metabolites produced by the bioactive isolates examined in the current study were dereplicated, according to the closest matches from the screened databases, as cytotoxic compounds previously extracted from Streptomycetes. For instance, the mass ion peak at m/z 554.2564 [M − H]⁻ with a predicted MF of C₃₄H₃₇NO₆ had the predicted match of Viridenomycin (3), a polyene lactam antibiotic isolated from Str. gannmycicus that exhibits anti-tumour activity against murine tumours (increase in life span of 23% for P388 leukaemia and 37% for B16 melanoma) [41]. Viridenomycin was also previously extracted from Str. viridochromogenes T-24146 and was found to display potent antibacterial activity against Sta. aureus, with a minimum inhibitory concentration (MIC) value below 1 µg ml⁻¹, and B. subtilis, with an MIC value of 0.05 µg ml⁻¹. It also exhibited antiprotozoal bioactivity against Trichomonas vaginalis with an MIC value of 0.06 µg ml⁻¹ [42]. Moreover, the mass ion peak at m/z 213.1483 [M + H]⁺ with a predicted MF of C₁₂H₂₀O₃ was chemically annotated as the small natural product MKN-003B (4), previously isolated from the marine actinomycete Streptoverticillium [43]. MKN-003B was also isolated earlier from Streptomyces sp. M02750 and was found to exhibit antifungal activity against C. albicans [44]. In addition, the mass ion peak at m/z 467.13614 [M − H]⁻ with a predicted MF of C₂₅H₂₄O₉ was dereplicated, according to the closest matches from the databases, as the anthraquinone derivative Atramycin A (5), previously isolated from Str. atratus BY90 and demonstrating anti-tumour properties against P388 leukaemia cells with an IC₅₀ value of 4.5 µg ml⁻¹ [45], and as the antibiotic BE 12406A (6), extracted from Streptomyces sp. BA12408 and exhibiting cytotoxic activity against P388 murine leukaemia with an IC₅₀ value of 0.2 µM [46]. This mass ion peak was also putatively identified as the polyketide natural product Landomycin I (7), isolated from Str. cyanogenus S-136, which shows potent cytotoxicity against the murine Lewis lung cancer LL/2 and human MCF-7 cell lines with 50% growth inhibition (GI₅₀) values of 3.5 and 3.7 µg ml⁻¹, respectively [48].
Figure 1. Phylogenetic study of the 16S rRNA gene sequences of the bioactive Streptomycetes. The neighbour-joining method was used to create the phylogenetic tree [22]. The evolutionary history of the taxa studied is represented by a bootstrap consensus tree estimated from 1000 replicates [34]. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test is shown next to the branches [34]. The Kimura 2-parameter method was employed to calculate evolutionary distances [35]. A total of 25 nucleotide sequences were examined. The 1st + 2nd + 3rd + noncoding codon positions were included. Gaps and missing data were removed from all positions. MEGA-X was used to conduct the evolutionary analyses [36].
Furthermore, several other secondary metabolites detected in our crude extracts were chemically annotated as bioactive antimicrobial compounds previously produced by Streptomycetes when compared with databases to get the closest predicted matches. For example, the mass ion peak at m/z 211.

Anti-cancer activity of Streptomycetes
Scientists continue to search for natural sources of anti-cancer agents, which have helped in discovering new drugs for managing tumours [10]. Our crude extracts were tested for their anti-cancer and cytotoxic effects against two cancer cell lines, HepG2 and MCF-7, and their normal counterparts, THLE2 and MCF-10A, respectively. All the tested Streptomycetes demonstrated positive anti-tumour activity against MCF-7, with weak inhibitory activity against MCF-10A (figure 3). Of note, the isolates SH10, SH8 and SH12 exhibited potent anti-cancer activity against MCF-7, with IC₅₀ values of 2.22, 4.12 and 7.37 µg ml⁻¹, respectively (figure 3). Meanwhile, the isolates SH12, SH4 and SH10 demonstrated strong anti-cancer activity against the HepG2 cell line, with IC₅₀ values of 1.31, 7.27 and 9.7 µg ml⁻¹, respectively, compared with their inhibitory activities on the normal liver cell line THLE2 (figure 3). Interestingly, the IC₅₀ values of the potent bioactive extracts were either equivalent or superior to those of the anti-cancer drug 5-fluorouracil, which showed IC₅₀ values of 14.9 and 7.28 µg ml⁻¹ against MCF-7 and HepG2, respectively. Our extracts' cytotoxicity assay results were consistent with several previously published data [10,84]. Numerous anti-tumour compounds were isolated earlier from various species of

Metabolomic profiling of the bioactive isolates
Metabolomics involves a multi-step protocol including sample preparation, instrumental analysis using tools such as LC-MS or NMR, followed by data processing and clean-up, and finally data analysis and interpretation. LC-MS and especially LC-HRMS are commonly used for instrumental analysis of samples prior to data analysis [29,37]. Dereplication and MVA are usually employed together as they constitute an excellent approach for drug discovery [11,38,87]. MVA is one of the chemometric tools for the analysis and interpretation of metabolomics data. Both unsupervised and supervised MVA are widely used for investigating variation among various sample groups in terms of m/z ratio or chemical shifts (ppm) in LC-HRMS and NMR, respectively [88].
In the current study, MVA was performed using MetaboAnalyst 4.0 to reduce the size of the dataset from the LC-HRMS analysis, correlate the findings and present the final conclusions [37,89]. First, we implemented unsupervised PCA to cluster the LC-HRMS data generated for the five bioactive extracts, given its capability to reduce the dimensionality of the data and to investigate the chemical variation between metabolomic profiles without previous knowledge of the dataset [89]. In our PCA model, the R² (goodness of fit) and Q² (predictability) values were higher than 0.8, indicating good performance, considering that R² and Q² are application-dependent [90] and that the threshold of significance for Q² is 0.5 [91]. As presented in figure 4, there are five principal components in total, of which PC1, PC2 and PC3 explained 95.3% of the total variation in the PCA model, whereas PC1 and PC2 together accounted for 88.6% of the total variation.
As presented in figure 5a,b, PCA revealed the chemical variation between the metabolomic profiles of the isolates regardless of their bioactivity. In the two-dimensional PCA score plot for PC1 and PC2, the extracts of SH4 and SH8 showed chemical profiles different from those of the other extracts (SH10, SH12 and SH13), which clustered close to each other. However, the three-dimensional PCA [...] [92]. Moreover, HCA was employed to visualize the chemical variation between crude extracts and to facilitate dataset analysis using other MVA tools [29]. Here, the HCA plot was presented as a dendrogram showing two main clusters of metabolomic profiles (figure 6b). The first cluster contained only isolate SH8, indicating its chemical uniqueness in accordance with the PCA results, whereas the second cluster comprised all the other isolates. The second cluster was further separated into two groups, with SH4 and SH13 in group 1 and SH10 and SH12 in group 2 (figure 6b).
To generate robust models, the supervised sPLS-DA algorithm was used to reduce the number of metabolites in the high-dimensional metabolomics data. In our sPLS-DA model, three PLS components represented 80.9% of the total variation in the model. The first PLS component contributed 34.4% of the total variation, whereas the second and third PLS components provided 25.3% and 21.2%, respectively (figure 7a). As presented in figure 7a, the chemical profiles of the bioactive Streptomycetes were scattered in the three-dimensional sPLS-DA score plot, except for SH4 and SH13, which were close to each other. Also, SH4, SH12 and SH13 were far from SH8 and SH10 in the score plot.
The variable importance in projection (VIP) was next used to distinguish the crude extracts by taking into account their most important features, those with the highest values indicated by sPLS-DA [29]. In the VIP plot, the 10 most important metabolites identified by sPLS-DA are presented in ascending order based on their intensity in the bioactive extracts (figure 7b). Dereplication of the VIP metabolites using DNP revealed that only one metabolite was putatively identified as a known natural product, whereas the other nine ion peaks were related to unknown compounds. The mass ion peak at m/z 162.1125 [M + H]⁺, corresponding to the predicted MF C₇H₁₅NO₃, was dereplicated, according to the closest match from the databases, as Deoxyvalidamine (29), which was previously isolated from Str. hygroscopicus subsp. limoneus as a glucosidase inhibitor [72].
In our study, the heat map showed the distribution of the main metabolites present in the active Streptomycetes extracts (figure 8). SH12 and SH13 were the most chemically different extracts among the actinobacterial isolates based on the intensity of significant metabolites partially matching the result of the sPLS-DA 3D score plot (figure 7a). In addition, both SH12 and SH13 were shown to be chemically different in terms of the intensity of metabolites in crude extracts, supporting the result of HCA grouping them in the same cluster under different groups (figure 6b).
The results of MVA suggested that the chemical profiles of the bioactive Streptomycetes were quite different from each other, indicating the richness of their metabolomes in interesting secondary metabolites. This was especially clear from HCA, sPLS-DA and the heat map, which support the idea that SH12 and SH13 have unique metabolites regardless of their different intensities in the two extracts, as presented in the heat map and confirmed by HCA. SH4 was also shown by both PCA and sPLS-DA to have a distinct chemical profile, indicating its richness in antibacterial and anti-cancer metabolites. SH4 also demonstrated the broadest antimicrobial activity against all the tested indicator microorganisms, with robust activity against both the HepG2 and MCF-7 cell lines. Furthermore, the dereplication and metabolomics analysis of putatively identified secondary metabolites helped in understanding the antimicrobial and anti-tumour activities of the crude extracts by identifying several of their bioactive compounds with diverse activities previously isolated from Streptomycetes.

Conclusion
The current study highlights the importance of combining dereplication and metabolomics data analysis, including the use of supervised and unsupervised MVA, when chemically profiling Streptomycetes for isolate prioritization and chemical isolation. Implementing a dereplication- and multivariate analysis-based approach could also help in understanding how secondary metabolites underlie different biological activities. Also, a metabolomics-guided approach supported by the results of dereplication and different bioassays could increase the likelihood of targeting, and hence isolating, new bioactive natural products from different sources, including Streptomycetes. In addition, combining metabolomics bioinformatic analysis with other resources, such as antiSMASH, which can assign the gene clusters encoding the production of secondary metabolites, could increase the chance of natural product discovery. Moreover, our study concluded that Egyptian soil actinomycetes remain a viable source for isolating novel bioactive natural compounds, particularly using a metabolomics-based approach instead of traditional bioassay-guided techniques.
Ethics. This article does not contain any studies carried out on human participants or animals.
Data accessibility.
Figure 8. Clustering of crude extracts from bioactive Streptomycetes according to the intensity of the mass ion peaks of the main metabolites, as shown in the heat map. Red indicates high intensity of mass ion peaks, whereas blue indicates lower intensity of mass ion peaks in the clustered crude extracts.