Title: Riboflavin biosynthesis in fungi
Technical field:
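The present invention relates to the genes for riboflavin biosynthesis in fungi, to the proteins they encode, and to genetic engineering methods for producing riboflavin using these genes and gene products.
Methods for producing riboflavin by fermentation of fungi such as Eremothecium ashbyii or Ashbya gossypii have been disclosed (The Merck Index, Windholz et al., eds., Merck & Co., p. 1183 (1983)).
EP 405370 describes riboflavin-overproducing bacterial strains obtained by transformation with the riboflavin biosynthesis genes of Bacillus subtilis.
Because the genetics of riboflavin biosynthesis differ between bacteria and eukaryotes, the above Bacillus subtilis genes are not suitable for a recombinant method of producing riboflavin with eukaryotic producer organisms such as Ashbya gossypii.
A patent application filed at the German Patent Office on November 19, 1992 describes the cloning of the riboflavin biosynthesis genes of the yeast Saccharomyces cerevisiae.
However, it was not possible to clone the Ashbya gossypii riboflavin biosynthesis genes by conventional hybridization methods using the Saccharomyces cerevisiae rib genes, evidently because the homology between the rib genes of Saccharomyces cerevisiae and Ashbya gossypii is insufficient for hybridization.
It is an object of the present invention to isolate the riboflavin biosynthesis genes from a eukaryote in order to provide a recombinant method for producing riboflavin in a eukaryotic producer organism.
We have found that this object is achieved by isolating six genes (rib genes) found in the ascomycete Ashbya gossypii, which encode the enzymes of riboflavin biosynthesis starting from GTP.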
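The invention relates to the following DNA sequences:
DNA sequences which encode a polypeptide having the amino acid sequence shown in SEQ ID NO: 2, or an analog or derivative of the polypeptide shown in SEQ ID NO: 2 in which one or more amino acids have been deleted, inserted or replaced by other amino acids without substantially reducing the enzymatic activity of the polypeptide.
DNA sequences which encode a polypeptide having the amino acid sequence shown in SEQ ID NO: 4, or an analog or derivative of the polypeptide shown in SEQ ID NO: 4 in which one or more amino acids have been deleted, inserted or replaced by other amino acids without substantially reducing the enzymatic activity of the polypeptide.
DNA sequences which encode a polypeptide having the amino acid sequence shown in SEQ ID NO: 6, or an analog or derivative of the polypeptide shown in SEQ ID NO: 6 in which one or more amino acids have been deleted, inserted or replaced by other amino acids without substantially reducing the enzymatic activity of the polypeptide.
DNA sequences which encode a polypeptide having the amino acid sequence shown in SEQ ID NO: 8, or an analog or derivative of the polypeptide shown in SEQ ID NO: 8 in which one or more amino acids have been deleted, inserted or replaced by other amino acids without substantially reducing the enzymatic activity of the polypeptide.
DNA sequences which encode a polypeptide having the amino acid sequence shown in SEQ ID NO: 10, or an analog or derivative of the polypeptide shown in SEQ ID NO: 10 in which one or more amino acids have been deleted, inserted or replaced by other amino acids without substantially reducing the enzymatic activity of the polypeptide.
DNA sequences which encode a polypeptide having the amino acid sequence shown in SEQ ID NO: 12, or an analog or derivative of the polypeptide shown in SEQ ID NO: 12 in which one or more amino acids have been deleted, inserted or replaced by other amino acids without substantially reducing the enzymatic activity of the polypeptide.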
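The primary structures of these genes and of their gene products (polypeptides) are shown in the sequence listing and are arranged as follows:
SEQ ID NO: 1    rib1 gene
SEQ ID NO: 2    rib1 gene product (GTP cyclohydrolase II)
SEQ ID NO: 3    rib2 gene
SEQ ID NO: 4    rib2 gene product (DRAP deaminase)
SEQ ID NO: 5    rib3 gene
SEQ ID NO: 6    rib3 gene product (DBP synthase)
SEQ ID NO: 7    rib4 gene
SEQ ID NO: 8    rib4 gene product (DMRL synthase)
SEQ ID NO: 9    rib5 gene
SEQ ID NO: 10   rib5 gene product (riboflavin synthase)
SEQ ID NO: 11   rib7 gene
SEQ ID NO: 12   rib7 gene product (HTP reductase)
Guanosine triphosphate (GTP) is converted by GTP cyclohydrolase II (rib1 gene product) into 2,5-diamino-6-ribosylamino-4(3H)-pyrimidinone 5-phosphate. This compound is then reduced by the rib7 gene product to 2,5-diamino-ribitylamino-2,4(1H,3H)-pyrimidine 5-phosphate and subsequently deaminated by the rib2 gene product to 5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione. In a reaction catalyzed by the rib4 gene product, the C4 compound DBP is then added to this compound, giving 6,7-dimethyl-8-ribityllumazine (DMRL), from which riboflavin is produced in a reaction catalyzed by the rib5 gene product. The C4 compound DBP (L-3,4-dihydroxy-2-butanone 4-phosphate) is formed from D-ribulose 5-phosphate in a reaction catalyzed by the rib3 gene product.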
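The order of the enzymatic steps described above can be summarized as a small data structure. The following is only an illustrative sketch: the class and variable names are hypothetical, and the shorthand labels HTP, DRAP and ARAP for the intermediates are inferred from the enzyme names and the activity-unit footnotes of the tables rather than defined explicitly in the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    """One enzymatic step: gene, enzyme, main substrate and product."""
    gene: str
    enzyme: str
    substrate: str
    product: str
    cosubstrate: Optional[str] = None


# Main branch in the order given in the text (GTP to riboflavin).
# HTP, DRAP and ARAP are shorthand labels assumed for this sketch.
MAIN_BRANCH = [
    Step("rib1", "GTP cyclohydrolase II", "GTP", "HTP"),
    Step("rib7", "HTP reductase", "HTP", "DRAP"),
    Step("rib2", "DRAP deaminase", "DRAP", "ARAP"),
    Step("rib4", "DMRL synthase", "ARAP", "DMRL", cosubstrate="DBP"),
    Step("rib5", "riboflavin synthase", "DMRL", "riboflavin"),
]

# Side branch that supplies the C4 compound DBP to the rib4 step.
SIDE_BRANCH = Step("rib3", "DBP synthase", "D-ribulose 5-phosphate", "DBP")


def gene_order(steps: List[Step]) -> List[str]:
    """Return the genes in pathway order, checking that each product
    is the substrate of the following step."""
    for prev, nxt in zip(steps, steps[1:]):
        if prev.product != nxt.substrate:
            raise ValueError(f"{prev.gene} does not feed into {nxt.gene}")
    return [step.gene for step in steps]


if __name__ == "__main__":
    print(" -> ".join(gene_order(MAIN_BRANCH)))
```

Note that the gene numbering does not follow the reaction order: the rib7 product acts directly after the rib1 product, and the rib3 product feeds the rib4 step from a separate branch.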
The DNA sequences described in SEQ ID NO: 1, 3, 5, 7, 9 and 11 encode the polypeptides described in SEQ ID NO: 2, 4, 6, 8, 10 and 12.
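Besides those indicated in the sequence listing, suitable DNA sequences are those which, owing to the degeneracy of the genetic code, have a different sequence but still encode the same polypeptide.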
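The effect of codon degeneracy can be illustrated with a short sketch. It assumes the standard genetic code. The first six codons are taken from the rib1 coding region in SEQ ID NO: 1, whereas the synonymous variant is a hypothetical sequence constructed only for this illustration.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon (one-letter amino acid codes).
    Spaces are ignored; an incomplete trailing codon is dropped."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First six codons of the rib1 open reading frame (SEQ ID NO: 1).
rib1_start = "ATG ACT GAA TAC ACA GTG"
# Hypothetical variant using synonymous codons only.
synonymous_variant = "ATG ACC GAG TAT ACG GTC"

if __name__ == "__main__":
    print(translate(rib1_start), translate(synonymous_variant))
```

Both sequences translate to Met-Thr-Glu-Tyr-Thr-Val although five of the six codons differ, which is the situation the paragraph above covers.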
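The invention further relates to DNA sequences which encode gene products whose primary structure differs from that in the sequence listing but which still have essentially the same biological properties as the gene products in the sequence listing. Biological properties means in particular the enzymatic activities that bring about riboflavin biosynthesis.
Such modified gene products with essentially the same biological properties can be obtained by deleting or inserting one or more amino acids or peptides, or by replacing amino acids with other amino acids, or can be isolated from organisms other than Ashbya gossypii.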
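DNA sequences encoding modified gene products generally have a homology of 80% or more to the DNA sequences shown in the sequence listing. Such DNA sequences can be isolated from eukaryotes other than Ashbya gossypii starting from the DNA sequences described in SEQ ID NO: 1, 3, 5, 7, 9 and 11, for example by conventional hybridization methods or by the PCR technique. These DNA sequences hybridize under standard conditions to the DNA sequences shown in SEQ ID NO: 1, 3, 5, 7, 9 and 11.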
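The text does not define how homology is measured. As one possible reading, the sketch below treats it as percent identity over two pre-aligned sequences of equal length; this interpretation, the function names and the variant sequence are assumptions made for illustration only. The reference is the first 20 nucleotides of the rib1 coding region.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical positions between two pre-aligned sequences
    of equal length (no gap handling in this simple sketch)."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def meets_threshold(seq_a: str, seq_b: str, threshold: float = 80.0) -> bool:
    """True if the identity reaches the 80% figure named in the text."""
    return percent_identity(seq_a, seq_b) >= threshold


# First 20 nucleotides of the rib1 coding region (SEQ ID NO: 1).
reference = "ATGACTGAATACACAGTGCC"
# Hypothetical variant with three substitutions.
variant = "ATGACCGAGTACACGGTGCC"

if __name__ == "__main__":
    print(percent_identity(reference, variant), meets_threshold(reference, variant))
```

Real comparisons would first need an alignment step that handles insertions and deletions, which this sketch deliberately leaves out.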
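Standard conditions means, for example, temperatures of 42-58 °C in an aqueous buffer solution with a concentration of 0.1-1× SSC (1× SSC: 0.15 M NaCl, 15 mM sodium citrate, pH 7.2). The experimental conditions for DNA hybridization are described in genetic engineering textbooks, for example in Sambrook et al., "Molecular Cloning" (Cold Spring Harbor Laboratory, 1989).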
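The salt concentrations implied by the 0.1-1× SSC range follow directly from the 1× composition given above. The following sketch works them out; the function and constant names are hypothetical.

```python
from typing import Dict

# Composition of 1x SSC as stated in the text, in mM.
SSC_1X_MM = {"NaCl": 150.0, "sodium citrate": 15.0}

# Ranges named in the text as standard hybridization conditions.
SSC_RANGE = (0.1, 1.0)      # fold concentration of SSC
TEMP_RANGE_C = (42.0, 58.0)  # degrees Celsius


def ssc_composition(fold: float) -> Dict[str, float]:
    """Salt concentrations (mM) of an SSC buffer at the given fold strength."""
    return {salt: round(conc * fold, 6) for salt, conc in SSC_1X_MM.items()}


def is_standard_condition(fold: float, temp_c: float) -> bool:
    """True if buffer strength and temperature fall inside the stated ranges."""
    return (SSC_RANGE[0] <= fold <= SSC_RANGE[1]
            and TEMP_RANGE_C[0] <= temp_c <= TEMP_RANGE_C[1])


if __name__ == "__main__":
    for fold in SSC_RANGE:
        print(f"{fold}x SSC:", ssc_composition(fold))
```

The range therefore spans 15-150 mM NaCl and 1.5-15 mM sodium citrate.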
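The invention further relates to regulatory sequences, in particular promoter sequences, which are located upstream, in the 5′ direction, of the DNA sequences encoding the respective polypeptide. These regulatory sequences are specified in the sequence listing and are explained in detail below.
Regulatory sequence of the rib1 gene: SEQ ID NO: 1, nucleotides 1-242
Regulatory sequence of the rib2 gene: SEQ ID NO: 3, nucleotides 1-450
Regulatory sequence of the rib3 gene: SEQ ID NO: 5, nucleotides 1-314
Regulatory sequence of the rib4 gene: SEQ ID NO: 7, nucleotides 1-270
Regulatory sequence of the rib5 gene: SEQ ID NO: 9, nucleotides 1-524
Regulatory sequence of the rib7 gene: SEQ ID NO: 11, nucleotides 1-352
These regulatory sequences can also be shortened in the 5′ and/or 3′ direction with hardly any loss of function.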
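The region essential for the regulatory effect is generally a fragment of 30-100, preferably 40-70, nucleotides within the sequence regions indicated above.
The function of these regulatory sequences can also be optimized, relative to the natural sequences, by directed mutagenesis.
The regulatory sequences of the invention are suitable for overexpressing genes in Ashbya, in particular genes responsible for riboflavin biosynthesis.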
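The nucleotide ranges given for the regulatory sequences use 1-based, inclusive positions as in the sequence listing. The sketch below shows how such a range maps onto a string slice and how candidate fragments of a given length could be enumerated within a region. The function names are hypothetical, and only the first 60 nucleotides of SEQ ID NO: 1 are used as sample data.

```python
from typing import Dict, List, Tuple

# gene -> (SEQ ID NO, first nucleotide, last nucleotide), as listed in the text.
REGULATORY_REGIONS: Dict[str, Tuple[int, int, int]] = {
    "rib1": (1, 1, 242),
    "rib2": (3, 1, 450),
    "rib3": (5, 1, 314),
    "rib4": (7, 1, 270),
    "rib5": (9, 1, 524),
    "rib7": (11, 1, 352),
}


def subsequence(seq: str, start: int, end: int) -> str:
    """Return positions start..end (1-based, inclusive) of seq."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates outside the sequence")
    return seq[start - 1:end]


def fragments(region: str, size: int) -> List[str]:
    """All contiguous fragments of the given size within a region."""
    return [region[i:i + size] for i in range(len(region) - size + 1)]


# First 60 nucleotides of SEQ ID NO: 1 (part of the rib1 upstream region).
SAMPLE = ("TTTCTGTCCGCATACTTCATATGCTCATCG"
          "CACATTGATAATGTACATTCGAAAAATTTC")

if __name__ == "__main__":
    print(subsequence(SAMPLE, 1, 10))
    print(len(fragments(SAMPLE, 40)), "fragments of 40 nt in the sample")
```

A region of length n contains n - size + 1 fragments of a given size, so the full 242-nucleotide rib1 region would contain 203 candidate fragments of 40 nucleotides.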
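The invention further relates to expression vectors containing one or more of the DNA sequences of the invention. Such expression vectors are obtained by providing the DNA sequences of the invention with suitable functional regulatory signals. Such regulatory signals are DNA sequences responsible for expression, for example promoters, operators, enhancers and ribosome binding sites, which are recognized and obeyed by the host organism.
Where appropriate, the expression vector may also contain other regulatory signals which, for example, control replication or recombination of the recombinant DNA in the host organism.
The invention further relates to host organisms transformed with the DNA sequences or expression vectors of the invention. Eukaryotes are preferably used as host organisms, particularly preferably those of the genera Saccharomyces, Candida, Pichia, Eremothecium or Ashbya. Particularly preferred species are Saccharomyces cerevisiae, Candida flaveri, Candida famata, Eremothecium ashbyii and Ashbya gossypii.
The invention further includes a recombinant method for producing riboflavin, in which the transformed host organisms of the invention are cultivated by fermentation in a conventional manner, and the riboflavin produced during the fermentation is isolated from the fermentation medium and, where appropriate, purified.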
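The rib genes and gene products can be isolated and characterized as described in the examples and in the sequence listing.
Example 1
Isolation of the Ashbya gossypii riboflavin biosynthesis genes (rib genes)
a. Construction of an Ashbya gossypii cDNA library
After cultivation of the riboflavin-overproducing strain Ashbya gossypii ATCC 10195 on YEPD medium, total RNA was extracted from mycelium in the late logarithmic growth phase (Sherman et al., "Methods in Yeast Genetics", Cold Spring Harbor, New York, 1989).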
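Poly(A)+ RNA was purified by two rounds of adsorption to and elution from oligo(dT)-cellulose (Aviv and Leder, Proc. Natl. Acad. Sci. USA 69, 1972, 1408-1412). The cDNA was isolated by the general method of Gubler and Hoffman (Gene 25, 1983, 263), and synthetic EcoRI adaptors were added to the ends of the blunt-ended cDNA molecules. The EcoRI-cut cDNA fragments were then phosphorylated with T4 polynucleotide kinase and cloned into the dephosphorylated, EcoRI-cut vector pYEura3 (Figure 1). pYEura3 (Clontech Laboratories, Inc., California) is a yeast expression vector which contains the galactose-inducible GAL1 and GAL10 promoters together with URA3, CEN4 and ARS1. These yeast elements allow cloned DNA fragments to be transformed into and expressed in yeast cells.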
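Aliquots of the ligation reaction were used to transform highly competent (Hanahan, DNA Cloning, ed. D. M. Glover; IRL Press, Oxford 1985, 109) Escherichia coli XL1-Blue (Bullock et al., Biotechniques 5 (1987) 376-378), and transformants were selected on the basis of their ampicillin resistance.
About 3 × 10^5 ampicillin-resistant cells were pooled and amplified, and plasmid DNA was isolated from them (Birnboim and Doly, Nucleic Acids Res. 7, 1979, 1513).
b. Isolation of Ashbya gossypii cDNA clones encoding riboflavin-forming enzymes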
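Ashbya gossypii cDNA clones encoding riboflavin-forming enzymes were isolated by functional complementation of Saccharomyces cerevisiae mutants affected in riboflavin biosynthesis.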
Strains AJ88 (Mata leu2 his3 rib1::URA3 ura3-52), AJ115 (Matalpha leu2 inos1 rib2::URA3 ura3-52), AJ71 (Matalpha leu2 inos1 rib3::URA3 ura3-52), AJ106 (Matalpha leu2 inos1 rib4::URA3 ura3-52), AJ66 (Mata canR inos1 rib5::URA3 ura3-52) and AJ121 (Matalpha leu2 inos1 rib7::URA3 ura3-52) are mutant strains produced by disrupting one of the six Saccharomyces cerevisiae genes involved in riboflavin biosynthesis (rib1 to rib5 and rib7).
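Each of these strains was transformed with 25 μg of cDNA from the Ashbya gossypii cDNA library and plated on solid medium containing galactose but no riboflavin. After about one week of growth, Rib+ transformants were isolated from the culture dishes.
One transformant of each transformed mutant (Rib1+, Rib2+, Rib3+, Rib4+, Rib5+ and Rib7+) was analyzed in each case, and the Rib+ phenotype was found in every case to be expressed only on galactose medium and not on glucose medium.
These results demonstrate that the Rib+ phenotype is expressed under the control of the galactose-inducible GAL10 promoter in the plasmid.
Plasmid DNA was isolated from the Rib1+, Rib2+, Rib3+, Rib4+, Rib5+ and Rib7+ transformants by transformation of E. coli, and the plasmids were called pJR715, pJR669, pJR788, pJR733, pJR681 and pJR827, respectively.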
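Partial sequencing of the cDNA inserts present in these plasmids showed that they encode proteins similar to the rib gene products of yeast.
c. Isolation of Ashbya gossypii genomic clones encoding riboflavin-forming enzymes
To isolate genomic copies of the riboflavin-forming genes of Ashbya gossypii, a genomic library of Ashbya gossypii ATCC 10195 was constructed in the cosmid SuperCos 1 (Stratagene Cloning Systems, California) and screened with 32P-labeled probes derived from cDNA copies of the Ashbya gossypii rib1, rib2, rib3, rib4, rib5 and rib7 genes.
Cosmid clones carrying rib1, rib2, rib3, rib4, rib5 and rib7 DNA were isolated by colony hybridization (Grunstein and Hogness, Proc. Natl. Acad. Sci. USA 72, 1975, 3961-3965). Further Southern analysis of restriction-digested cosmid DNA with the same rib-specific cDNA probes made it possible to identify restriction fragments containing the Ashbya gossypii rib1, rib2, rib3, rib4, rib5 and rib7 genes.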
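A BamHI-ClaI DNA fragment 3.1 kb in length was found to contain the complete Ashbya gossypii rib1 gene encoding GTP cyclohydrolase II. This fragment was isolated from an agarose gel and cloned into the phagemid pBluescript KS(+) (Stratagene Cloning Systems) cut with BamHI and ClaI, giving plasmid pJR765 (Figure 2).
A DNA sequence 1329 bp in length was obtained (SEQ ID NO: 1), which contains the 906 bp rib1 open reading frame, 242 bp of 5′-noncoding region and 181 bp of 3′-noncoding region.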
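The complete Ashbya gossypii rib2 gene encoding DRAP deaminase was found on an EcoRI-PstI fragment 3.0 kb in length, which was cloned into pBluescript KS(+) to give plasmid pJR758 (Figure 3).
An EcoRI-PstI insert region 2627 bp in length was sequenced; it comprises the 1830 bp rib2 open reading frame, 450 bp of 5′-untranslated region and 347 bp of 3′-untranslated region (SEQ ID NO: 3).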
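The complete Ashbya gossypii rib3 gene encoding DBP synthase was found on a PstI-HindIII fragment 1.5 kb in length, which was cloned into pBluescript KS(+) to give plasmid pJR790 (Figure 4).
A PstI-HindIII insert region 1082 bp in length was sequenced; it contains the 639 bp rib3 open reading frame, 314 bp of 5′-untranslated region and 129 bp of 3′-untranslated region (SEQ ID NO: 5).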
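The Ashbya gossypii rib4 gene encoding DMRL synthase was found on a PstI-PstI fragment 3.2 kb in length, which was cloned into pBluescript KS(+) to give plasmid pJR762 (Figure 5).
A PstI-PstI insert region 996 bp in length was sequenced; it comprises the 519 bp rib4 open reading frame, 270 bp of 5′-untranslated region and 207 bp of 3′-untranslated region (SEQ ID NO: 7).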
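The complete Ashbya gossypii rib5 gene encoding riboflavin synthase was found on a PstI-PstI fragment 2.5 kb in length, which was cloned into pBluescript KS(+) to give plasmid pJR739 (Figure 6).
A PstI-PstI insert region 1511 bp in length was sequenced; it comprises the 708 bp rib5 open reading frame, 524 bp of 5′-untranslated region and 279 bp of 3′-untranslated region (SEQ ID NO: 9).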
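The lengths reported in this example can be cross-checked with simple arithmetic: the two flanking regions and the open reading frame should add up to the sequenced length, and the open reading frame should be a multiple of three that yields one residue fewer than its codon count once the stop codon is subtracted. The sketch below does this for all six sequenced regions, including the rib7 figures given at the end of this example. The names are hypothetical. Polypeptide lengths are taken from the sequence listing where stated there, and the rib7 polypeptide length is left unset and merely computed.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class Insert(NamedTuple):
    total_bp: int
    utr5_bp: int
    orf_bp: int
    utr3_bp: int
    listed_residues: Optional[int]  # from the sequence listing, if stated


INSERTS: Dict[str, Insert] = {
    "rib1": Insert(1329, 242, 906, 181, 301),
    "rib2": Insert(2627, 450, 1830, 347, 609),
    "rib3": Insert(1082, 314, 639, 129, 212),
    "rib4": Insert(996, 270, 519, 207, 172),
    "rib5": Insert(1511, 524, 708, 279, 235),
    "rib7": Insert(1596, 352, 741, 503, None),
}


def cds_summary(ins: Insert) -> Tuple[int, int, int]:
    """Return (CDS start, CDS end, residues) after consistency checks.
    Coordinates are 1-based and inclusive; the stop codon is part of the ORF."""
    if ins.utr5_bp + ins.orf_bp + ins.utr3_bp != ins.total_bp:
        raise ValueError("segments do not add up to the sequenced length")
    if ins.orf_bp % 3 != 0:
        raise ValueError("open reading frame is not a multiple of three")
    residues = ins.orf_bp // 3 - 1  # subtract the stop codon
    if ins.listed_residues is not None and residues != ins.listed_residues:
        raise ValueError("residue count differs from the sequence listing")
    return ins.utr5_bp + 1, ins.utr5_bp + ins.orf_bp, residues


if __name__ == "__main__":
    for gene, ins in INSERTS.items():
        print(gene, cds_summary(ins))
```

All six sets of figures pass these checks, and the computed coding-region coordinates for rib1 (243 to 1148) agree with the feature table of SEQ ID NO: 1.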
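Finally, the Ashbya gossypii rib7 gene encoding HTP reductase was found on an EcoRI-EcoRI fragment 4.1 kb in length, which was cloned into pBluescript KS(+) to give plasmid pJR845 (Figure 7).
An EcoRI-EcoRI insert region 1596 bp in length was sequenced; it comprises the 741 bp rib7 open reading frame, 352 bp of 5′-untranslated region and 503 bp of 3′-untranslated region (SEQ ID NO: 11).
Example 2
mRNA analysis of the Ashbya gossypii rib genes
Northern analysis was carried out to identify rib-specific transcripts. Total RNA was isolated from the Ashbya gossypii strain ATCC 10195 as described in Example 1. RNA samples (5 μg) of the strain were fractionated by electrophoresis on a 0.8% agarose-formaldehyde gel alongside RNA size markers and vacuum-blotted onto nylon membranes (Thomas, Proc. Natl. Acad. Sci. USA 77, 1980, 5201-5205).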
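The nylon membranes were hybridized separately with 32P-labeled rib-specific DNA probes at 42 °C in 5× SSC in the presence of 50% formamide. The Ashbya gossypii rib1 gene is expressed as a single message of about 1150 nucleotides; the mRNA was detected in both strains with a SmaI-SacI probe 0.7 kbp in length from plasmid pJR765.
Similarly, using a 0.5 kbp SmaI-SmaI fragment from pJR758, a 0.6 kbp HindIII-KpnI fragment from pJR790, a 0.5 kbp ScaI-HindIII fragment from pJR739 and a 0.3 kbp PstI-PstI fragment from pJR845 as specific probes, single transcripts of 1900 nucleotides (rib2), 900 nucleotides (rib3), 800 nucleotides (rib4), 1050 nucleotides (rib5) and 1000 nucleotides (rib7) were detected on the blots.
Example 3
Expression of the Ashbya gossypii rib genes in Saccharomyces cerevisiae
As described in Example 1, well-characterized Saccharomyces cerevisiae mutants that are defective in one step of riboflavin biosynthesis can grow on medium without riboflavin if they carry a plasmid encoding the complementing Ashbya enzyme. To test the function of the Ashbya gossypii rib gene products, flavin-forming enzyme activities were measured in cell-free extracts from Saccharomyces cerevisiae mutants carrying one of the expression plasmids pJR715, pJR669, pJR788, pJR733, pJR681 and pJR827.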
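These plasmids are derived from pYEura3 and were described in Example 1; they contain the Ashbya gossypii rib-specific cDNA fragments under the control of the galactose-inducible GAL10 promoter.
Cell-free protein extracts of Saccharomyces cerevisiae were obtained from cultures grown in liquid medium to an optical density of about 2 OD.
The cells were harvested, washed with cold 20 mM Tris-HCl (pH 7.5) and resuspended in the same buffer supplemented with 1 mM phenylmethylsulfonyl fluoride.
Cell lysates were prepared by vortexing in the presence of glass beads and centrifuging at 3000 g and 4 °C for 20 minutes.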
The GTP cyclohydrolase II, DRAP deaminase, DBP synthase, DMRL synthase, riboflavin synthase and HTP reductase activities were determined by methods described in the literature (Shavlovsky et al., Arch. Microbiol. 124, 1980, 255-259; Richter et al., J. Bacteriol. 175, 1993, 4045-4051; Klein and Bacher, Z. Naturforsch. 35b, 1980, 482-484; Richter et al., J. Bacteriol. 174, 1992, 4050-4056; Nielsen et al., J. Biol. Chem. 261, 1986, 3661; Plaut and Harvey, Methods Enzymol. 18B, 1971, 515-538; Hollander and Brown, Biochem. Biophys. Res. Commun. 89, 1979, 759-763; Shavlovski et al., Biochim. Biophys. Acta 428, 1976, 611-618).
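Protein was quantified by the Peterson method (Anal. Biochem. 83, 1977, 346-356). As shown in Table 1, plasmid pJR715 brings about expression of GTP cyclohydrolase II activity in the Saccharomyces cerevisiae mutant AJ88. Moreover, this activity is present only in cells grown on galactose medium, which indicates that expression of the Ashbya gossypii rib1 cDNA takes place under the control of the galactose-inducible GAL10 promoter.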
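These results also demonstrate that rib1 encodes GTP cyclohydrolase II in Ashbya gossypii. It was shown in a similar way that, in this fungus, rib2 encodes DRAP deaminase, rib3 DBP synthase, rib4 DMRL synthase, rib5 riboflavin synthase and rib7 HTP reductase.

Table 1. GTP cyclohydrolase II activity of the Saccharomyces cerevisiae rib1 mutant AJ88 and its transformants.
n.d.: not detected. *) Wild type. **) Unit of GTP cyclohydrolase II activity: 1 U catalyzes the formation of 1 nmol of HTP per hour.

Table 2. DRAP deaminase activity of the Saccharomyces cerevisiae rib2 mutant AJ115 and its transformants.
n.d.: not detected. *) 1 U catalyzes the formation of 1 nmol of ARAP per hour.

Table 3. DBP synthase activity of the Saccharomyces cerevisiae rib3 mutant AJ71 and its transformants.
n.d.: not detected. *) 1 U catalyzes the formation of 1 nmol of DBP per hour.

Table 4. DMRL synthase activity of the Saccharomyces cerevisiae rib4 mutant AJ106 and its transformants.
n.d.: not detected. *) 1 U catalyzes the formation of 1 nmol of DMRL per hour.

Table 5. Riboflavin synthase activity of the Saccharomyces cerevisiae rib5 mutant AJ66 and its transformants.
n.d.: not detected. *) 1 U catalyzes the formation of 1 nmol of riboflavin per hour.

Table 6. HTP reductase activity of the Saccharomyces cerevisiae rib7 mutant AJ121 and its transformants.
n.d.: not detected. *) 1 U catalyzes the formation of 1 nmol of DRAP per hour.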
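The unit definition used in the table footnotes (1 U forms 1 nmol of product per hour) can be turned into a short calculation. The sketch below is illustrative only: the function names are hypothetical, the measurement values are invented for the example and are not data from the tables, and expressing the result per milligram of protein is an assumption suggested by the protein quantification step rather than something the text states.

```python
def enzyme_units(product_nmol: float, hours: float) -> float:
    """Activity in U, where 1 U forms 1 nmol of product per hour."""
    if hours <= 0:
        raise ValueError("incubation time must be positive")
    return product_nmol / hours


def specific_activity(product_nmol: float, hours: float, protein_mg: float) -> float:
    """Activity per mg of extract protein (U/mg); assumed reporting basis."""
    if protein_mg <= 0:
        raise ValueError("protein amount must be positive")
    return enzyme_units(product_nmol, hours) / protein_mg


if __name__ == "__main__":
    # Hypothetical assay: 12 nmol of product formed in 30 minutes
    # by an extract containing 0.25 mg of protein.
    print(enzyme_units(12.0, 0.5), "U")
    print(specific_activity(12.0, 0.5, 0.25), "U/mg")
```

With these invented numbers the assay corresponds to 24 U, or 96 U per mg of protein.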
序列表(1)一般信息(i)申請人(A)名稱BASF Aktiengesellschaft(B)街道Carl-Bosch-Strasse 38(C)城市Ludwigshafen(E)國家德國(F)郵編D-67056(G)電話0621/6048526(H)傳真0621/6043123(I)電傳1762175170(ii)題目真菌中的核黃素生物合成(iii)序列數(shù)目12(iv)計算機(jī)可讀形式(A)媒介類型軟盤(B)計算機(jī)IBM PC兼容機(jī)(C)操作系統(tǒng)PC-DOS/MS-DOS(D)軟件Patenf In Release#1.0,Version#1.25(EPO)(2)SEQ ID NO1的信息(i)序列特征(A)長度1329堿基對(B)類型核酸(C)鏈型雙鏈(D)拓?fù)浣Y(jié)構(gòu)線性(ii)分子類型cDNA到mRNA(iii)假擬否(iii)反義否(vi)原始來源(A)生物體棉阿舒囊霉(ix)特征(A)名稱/關(guān)鍵詞5’UTR(B)位置1..242(ix)特征(A)名稱/關(guān)鍵詞CDS(B)位置243..1148(ix)特征(A)名稱/關(guān)鍵詞3’UTR(B)位置1149..1329(xi)序列描述SEQ ID NO1TTTCTGTCCG CATACTTCAT ATGCTCATCG CACATTGATA ATGTACATTC GAAAAATTTC 60AAGATTAGCC TCCGTGAACA GCGATTTACC TTAGGCAAAA GTAACAAAAG GCTTTTCCGT 120AGGTGCTTTG TCATTCAACA ATCCACGTCG GAATTGGCGA CTATATAGTG TAGGGCCCAT 180AAAGCAGTAG TCGGTGTTGA TAGCTGTGTC AGACCAACTC TTTGTTAATT ACTGAAGCTG 240AT ATG ACT GAA TAC ACA GTG CCA GAA GTG AGG TGT GTC GCA CGC GCG287Met Thr Glu Tyr Thr Val Pro Glu Val Arg Cys Val Ala Arg Ala1 5 10 15CGC ATA CCG ACG GTA CAG GGC ACC GAT GTC TTC CTC CAT CTA TAC CAC 335Arg Ile Pro Thr Val Gln Gly Thr Asp Val Phe Leu His Leu Tyr His20 25 30AAC TCG ATC GAC AGC AAG GAA CAC CTA GCG ATT GTC TTC GGC GAG AAC 383Asn Ser Ile Asp Ser Lys Glu His Leu Ala Ile Val Phe Gly Glu Asn35 40 45ATA CGC TCG CGG AGT CTG TTC CGG TAC CGG AAA GAC GAC ACG CAG CAG 431Ile Arg Ser Arg Ser Leu Phe Arg Tyr Arg Lys Asp Asp Thr Gln Gln50 55 60GCG CGG ATG GTG CGG GGC GCC TAC GTG GGC CAG CTG TAC CCC GGG CGG 479Ala Arg Met Val Arg Gly Ala Tyr Val Gly Gln Leu Tyr Pro Gly Arg65 70 75ACC GAG GCA GAC GCG GAT CGG CGT CAG GGC CTG GAG CTG CGG TTT GAT 527Thr Glu Ala Asp Ala Asp Arg Arg Gln Gly Leu Glu Leu Arg Phe Asp80 85 90 95GAG ACA GGG CAG CTG GTG GTG GAG CGG GCG ACG ACG TGG ACC AGG GAG 575Glu Thr Gly Gln Leu Val Val Glu Arg Ala Thr Thr Trp Thr Arg Glu100 105 110CCG ACA CTG GTG CGG CTG CAC TCG GAG TGT TAC ACG GGC GAG ACG GCG 623Pro Thr Leu Val Arg Leu His Ser Glu Cys Tyr Thr Gly Glu Thr Ala115 120 125TGG AGC GCG CGG TGC GAC TGC GGG GAG CAG TTC GAC CAG 
GCG GGT AAG 671Trp Ser Ala Arg Cys Asp Cys Gly Glu Gln Phe Asp Gln Ala Gly Lys130 135 140CTG ATG GCT GCG GCG ACA GAG GGC GAG GTG GTT GGC GGT GCG GGG CAC 719Leu Met Ala Ala Ala Thr Glu Gly Glu Val Val Gly Gly Ala Gly His145 150 155GGC GTG ATC GTG TAC CTG CGG CAG GAG GGC CGC GGC ATC GGG CTA GGC 767Gly Val Ile Val Tyr Leu Arg Gln Glu Gly Arg Gly Ile Gly Leu Gly160 165 170 175GAG AAG CTG AAG GCG TAC AAC CTG CAG GAC CTG GGC GCG GAC ACG GTG 815Glu Lys Leu Lys Ala Tyr Asn Leu Gln Asp Leu Gly Ala Asp Thr Val180 185 190CAG GCG AAC GAG CTG CTC AAC CAC CCT GCG GAC GCG CGC GAC TTC TCG 863Gln Ala Asn Glu Leu Leu Asn His Pro Ala Asp Ala Arg Asp Phe Ser195 200 205TTG GGG CGC GCA ATC CTA CTG GAC CTC GGT ATC GAG GAC ATC CGG TTG 911Leu Gly Arg Ala Ile Leu Leu Asp Leu Gly Ile Glu Asp Ile Arg Leu210 215 220CTC ACG AAT AAC CCC GAC AAG GTG CAG CAG GTG CAC TGT CCG CCG GCG 959Leu Thr Asn Asn Pro Asp Lys Val Gln Gln Val His Cys Pro Pro Ala225 230 235CTA CGC TGC ATC GAG CGG GTG CCC ATG GTG CCG CTT TCA TGG ACT CAG 1007Leu Arg Cys Ile Glu Arg Val Pro Met Val Pro Leu Ser Trp Thr Gln240 245 250 255CCC ACA CAG GGC GTG CGC TCG CGC GAG CTG GAC GGC TAC CTG CGC GCC 1055Pro Thr Gln Gly Val Arg Ser Arg Glu Leu Asp Gly Tyr Leu Arg Ala260 265 270AAG GTC GAG CGC ATG GGG CAC ATG CTG CAG CGG CCG CTG GTG CTG CAC 1103Lys Val Glu Arg Met Gly His Met Leu Gln Arg Pro Leu Val Leu His275 280 285ACG TCT GCG GCG GCC GAG CTC CCC CGC GCC AAC ACA CAC ATA TAATCTTTGC 1155Thr Ser Ala Ala Ala Glu Leu Pro Arg Ala Asn Thr His Ile290 295 300TATATTAAAA CTCTATAAAC GTATGCCACA CGGCGCCCGC GGGCTGCCAC ACGCTGCTCA1215CGGGCTGCCG AACAGTTCTA ACAAGTAATC GCGCGCCTCG CCAGTGATCG TGGCGAGCAC1275CTTGTCGTCC ATCATCACAT ATCCTCGGCT ACAGTCGTCG TTGAAGAGCG TGCA 1329(2)SEQ ID NO2的信息(i)序列特征(A)長度301個氨基酸(B)類型氨基酸(D)拓?fù)浣Y(jié)構(gòu)線性(ii)分子類型蛋白質(zhì)(xi)序列描述SEQ ID NO2Met Thr Glu Tyr Thr Val Pro Glu Val Arg Cys Val Ala Arg Ala Arg1 5 10 15Ile Pro Thr Val Gln Gly Thr Asp Val Phe Leu His Leu Tyr His Asn20 25 30Ser Ile Asp Ser Lys Glu His Leu Ala Ile Val Phe 
Gly Glu Asn Ile35 40 45Arg Ser Arg Ser Leu Phe Arg Tyr Arg Lys Asp Asp Thr Gln Gln Ala50 55 60Arg Met Val Arg Gly Ala Tyr Val Gly Gln Leu Tyr Pro Gly Arg Thr65 70 75 80Glu Ala Asp Ala Asp Arg Arg Gln Gly Leu Glu Leu Arg Phe Asp Glu85 90 95Thr Gly Gln Leu Val Val Glu Arg Ala Thr Thr Trp Thr Arg Glu Pro100 105 110Thr Leu Val Arg Leu His Ser Glu Cys Tyr Thr Gly Glu Thr Ala Trp115 120 125Ser Ala Arg Cys Asp Cys Gly Glu Gin Phe Asp Gln Ala Gly Lys Leu130 135 140Met Ala Ala Ala Thr Glu Gly Glu Val Val Gly Gly Ala Gly His Gly145 150 155 160Val Ile Val Tyr Leu Arg Gln Glu Gly Arg Gly Ile Gly Leu Gly Glu165 170 175Lys Leu Lys Ala Tyr Asn Leu Gln Asp Leu Gly Ala Asp Thr Val Gln180 185 190Ala Asn Glu Leu Leu Asn His Pro Ala Asp Ala Arg Asp Phe Ser Leu195 200 205Gly Arg Ala Ile Leu Leu Asp Leu Gly Ile Glu Asp Ile Arg Leu Leu210 215 220Thr Ash Asn Pro Asp Lys Val Gln Gln Val His Cys Pro Pro Ala Leu225 230 235 240Arg Cys Ile Glu Arg Val Pro Met Val Pro Leu Ser Trp Thr Gln Pro245 250 255Thr Gln Gly Val Arg Ser Arg Glu Leu Asp Gly Tyr Leu Arg Ala Lys260 265 270Val Glu Arg Met Gly His Met Leu Gln Arg Pro Leu Val Leu His Thr275 280 285Ser Ala Ala Ala Glu Leu Pro Arg Ala Asn Thr His Ile290 295 300(2)SEQ ID NO3的信息(i)序列特征(A)長度2627堿基對(B)類型核酸(C)鏈型雙鏈(D)拓?fù)浣Y(jié)構(gòu)線性(ii)分子類型cDNA到mRNA(iii)假擬否(iii)反義否(vi)原始來源(A)生物體棉阿舒囊霉(ix)特征(A)名稱/關(guān)鍵詞5’UTR(B)位置1..450(ix)特征(A)名稱/關(guān)鍵詞CDS(B)位置451..2280(ix)特征(A)名稱/關(guān)鍵詞3’UTR(B)位置2281..2627(xi)序列描述SEQ ID NO3CTGCAGGACA ATTTAAATTA CGATTACACG CGGCAGCCTT CTTGGTGCGA CAGGATTTTG 60TACAAGAATG ACCCCAAGCG GGTAAGAGTT CATAGGTATG CCTCGATTGA TAGACGTTCC 120ATTTTGAATT ATACTGATCA CGAACCCGTA ACGCTCGATG TCAGCGTTTC ATGCCATACA 180CAATTTGTCC CAATGGCTAT GCAGAATATT TCCCCACAGA GCACCATGGA AATGTATGTG 240GGAGACGTCA CAGATATACT ACTGATGTTG TTCTCCAGAG TATACTACGC CCCTACCATA 300TTCGATCTTG TGGTATTGAC GATATTCCTC TGTTTGGTTT TACTGGCACT ATTCCGTTTG 360ACGGTATAGC GCTATTCGTT CATAGTGACA CATGCGGCAC TAGCTATTCA GCGAATCCTT 420TATAAACTGC TACTTAACGT TCGTAACACC ATG CTC AAA GGC GTT CCT GGC 
CTT 474Met Leu Lys Gly Val Pro Gly Leu1 5CTT TTT AAG GAG ACG CAA CGT CAT CTG AAA CCC AGG CTG GTT AGG ATT 522Leu Phe Lys Glu Thr Gln Arg His Leu Lys Pro Arg Leu Val Arg Ile10 15 20ATG GAA AAC ACA TCG CAG GAT GAG AGT CGC AAA AGA CAG GTC GCT TCG 570Met Glu Asn Thr Ser Gln Asp Glu Ser Arg Lys Arg Gln Val Ala Ser25 30 35 40AAC TTG AGC AGC GAT GCC GAT GAG GGC TCG CCG GCA GTT ACG AGG CCG 618Asn Leu Ser Ser Asp Ala Asp Glu Gly Ser Pro Ala Val Thr Arg Pro45 50 55GTT AAA ATC ACC AAA CGC CTC AGG AAG AAG AAC CTC GGG ACA GGC GAG 666Val Lys Ile Thr Lys Arg Leu Arg Lys Lys Asn Leu Gly Thr Gly Glu60 65 70CTA CGG GAC AAA GCA GGA TTC AAG TTG AAG GTG CAA GAC GTG AGC AAA 714Leu Arg Asp Lys Ala Gly Phe Lys Leu Lys Val Gln Asp Val Ser Lys75 80 85AAC CGT CAC AGA CAG GTC GAT CCG GAA TAC GAA GTC GTG GTA GAT GGC 762Asn Arg His Arg Gln Val Asp Pro Glu Tyr Glu Val Val Val Asp Gly90 95 100CCG ATG CGC AAG ATC AAA CCG TAT TTC TTC ACA TAC AAG ACT TTC TGC 810Pro Met Arg Lys Ile Lys Pro Tyr Phe Phe Thr Tyr Lys Thr Phe Cys105 110 115 120AAG GAG CGC TGG AGA GAT CGG AAG TTG CTT GAT GTG TTT GTG GAT GAA 858Lys Glu Arg Trp Arg Asp Arg Lys Leu Leu Asp Val Phe Val Asp Glu125 130 135TTT CGG GAC CGC GAT AGG CCT TAC TAC GAG AAA GTC ATC GGT TCG GGT 906Phe Arg Asp Arg Asp Arg Pro Tyr Tyr Glu Lys Val Ile Gly Ser Gly140 145 150GGT GTG CTC CTG AAC GGT AAG TCA TCG ACG TTA GAT AGC GTA TTG CGT 954Gly Val Leu Leu Asn Gly Lys Ser Ser Thr Leu Asp Ser Val Leu Arg155 160 165AAT GGA GAC CTC ATT TCG CAC GAG CTG CAC CGT CAT GAG CCA CCG GTC 1002Asn Gly Asp Leu Ile Ser His Glu Leu His Arg His Glu Pro Pro Val170 175 180TCC TCT AGG CCG ATT AGG ACG GTG TAC GAA GAT GAT GAC ATC CTG GTG 1050Ser Ser Arg Pro Ile Arg Thr Val Tyr Glu Asp Asp Asp Ile Leu Val185 190 195 200ATT GAC AAG CCC AGC GGG ATT CCA GCC CAT CCC ACC GGG CGT TAC CGC 1098Ile Asp Lys Pro Ser Gly Ile Pro Ala His Pro Thr Gly Arg Tyr Arg205 210 215TTC AAC TCC ATT ACG AAA ATA CTT GAA AAA CAG CTT GGA TAC ACT GTT 1146Phe Asn Ser Ile Thr Lys Ile Leu Glu Lys Gln Leu Gly Tyr Thr 
Val220 225 230CAT CCA TGT AAC CGA CTG GAC CGC CTA ACC AGT GGC CTA ATG TTC TTG 1194His Pro Cys Asn Arg Leu Asp Arg Leu Thr Ser Gly Leu Met Phe Leu235 240 245GCA AAA ACT CCA AAG GGA GCC GAT GAG ATG GGT GAT CAG ATG AAG GCG 1242Ala Lys Thr Pro Lys Gly Ala Asp Glu Met Gly Asp Gln Met Lys Ala250 255 260CGC GAA GTG AAG AAA GAA TAT GTT GCC CGG GTT GTT GGG GAA TTT CCT 1290Arg Glu Val Lys Lys Glu Tyr Val Ala Arg Val Val Gly Glu Phe Pro265 270 275 280ATA GGT GAG ATA GTT GTG GAT ATG CCA CTG AAG ACT ATA GAG CCG AAG 1338Ile Gly Glu Ile Val Val Asp Met Pro Leu Lys Thr Ile Glu Pro Lys285 290 295CTT GCC CTA AAC ATG GTT TGC GAC CCG GAA GAC GAA GCG GGC AAG GGC 1386Leu Ala Leu Asn Met Val Cys Asp Pro Glu Asp Glu Ala Gly Lys Gly300 305 310GCT AAG ACG CAG TTC AAA AGA ATC AGC TAC GAT GGA CAA ACG AGC ATA 1434Ala Lys Thr Gln Phe Lys Arg Ile Ser Tyr Asp Gly Gln Thr Ser Ile315 320 325GTC AAG TGC CAA CCG TAC ACG GGC CGG ACG CAT CAG ATC CGT GTT CAC 1482Val Lys Cys Gln Pro Tyr Thr Gly Arg Thr His Gln Ile Arg Val His330 335 340TTG CAA TAC CTG GGC TTC CCA ATT GCC AAC GAT CCG ATT TAT TCC AAT 1530Leu Gln Tyr Leu Gly Phe Pro Ile Ala Asn Asp Pro Ile Tyr Ser Asn345 350 355 360CCG CAC ATA TGG GGC CCA AGT CTG GGC AAG GAA TGC AAA GCA GAC TAC 1578Pro His Ile Trp Gly Pro Ser Leu Gly Lys Glu Cys Lys Ala Asp Tyr365 370 375AAG GAG GTC ATC CAA AAA CTA AAC GAA ATT GGT AAG ACT AAA TCT GCG 1626Lys Glu Val Ile Gln Lys Leu Asn Glu Ile Gly Lys Thr Lys Ser Ala380 385 390GAA AGT TGG TAC CAT TCT GAT TCC CAA GGT GAA GTT TTC AAA GGG GAA 1674Glu Ser Trp Tyr His Ser Asp Ser Gln Gly Glu Val Phe Lys Gly Glu395 400 405CAA TGC GAT GAA TGT GGC ACC GAA CTG TAC ACT GAC CCG GGC CCG AAT 1722Gln Cys Asp Glu Cys Gly Thr Glu Leu Tyr Thr Asp Pro Gly Pro Asn410 415 420GAT CTT GAC TTA TGG TTG CAT GCA TAT CGG TAT GAA TCC ACT GAA CTG 1770Asp Leu Asp Leu Trp Leu His Ala Tyr Arg Tyr Glu Ser Thr Glu Leu425 430 435 440GAT GAG AAC GGT GCT AAA AAG CGG AGT TAC TCT ACT GCG TTT CCT GAG 1818Asp Glu Asn Gly Ala Lys Lys Arg Ser Tyr Ser Thr Ala Phe Pro 
Glu445 450 455TGG GCT CTT GAG CAG CAC GGC GAC TTC ATG CGG CTT GCC ATC GAA CAG 1866Trp Ala Leu Glu Gln His Gly Asp Phe Met Arg Leu Ala Ile Glu Gln460 465 470GCT AAG AAA TGC CCA CCC GCG AAG ACA TCA TTT AGC GTT GGT GCC GTG 1914Ala Lys Lys Cys Pro Pro Ala Lys Thr Ser Phe Ser Val Gly Ala Val475 480 485TTA GTT AAT GGG ACC GAG ATT TTG GCC ACT GGT TAC TCA CGG GAG CTG 1962Leu Val Asn Gly Thr Glu Ile Leu Ala Thr Gly Tyr Ser Arg Glu Leu490 495 500GAA GGC AAC ACG CAC GCT GAA CAA TGT GCA CTT CAA AAA TAT TTT GAA 2010Glu Gly Asn Thr His Ala Glu Gln Cys Ala Leu Gln Lys Tyr Phe Glu505 510 515 520CAA CAT AAA ACC GAC AAG GTT CCT ATT GGT ACA GTA ATA TAC ACG ACT 2058Gln His Lys Thr Asp Lys Val Pro Ile Gly Thr Val Ile Tyr Thr Thr525 530 535ATG GAG CCT TGT TCT CTC CGT CTC AGT GGT AAT AAA CCG TGT GTT GAG 2106Met Glu Pro Cys Ser Leu Arg Leu Ser Gly Asn Lys Pro Cys Val Glu540 545 550CGT ATA ATC TGC CAG CAG GGT AAT ATT ACT GCT GTT TTT GTT GGC GTA 2154Arg Ile Ile Cys Gln Gln Gly Asn Ile Thr Ala Val Phe Val Gly Val555 560 565CTT GAG CCA GAC AAC TTC GTG AAG AAC AAT ACA AGT CGT GCG CTA TTG 2202Leu Glu Pro Asp Asn Phe Val Lys Asn Asn Thr Ser Arg Ala Leu Leu570 575 580GAA CAA CAT GGT ATA GAC TAT ATT CTT GTC CCT GGG TTT CAA GAA GAA 2250Glu Gln His Gly Ile Asp Tyr Ile Leu Val Pro Gly Phe Gln Glu Glu585 590 595 600TGT ACT GAA GCC GCA TTG AAG GGT CAT TGATTTTGCT GCGAATTGTA 2297Cys Thr Glu Ala Ala Leu Lys Gly His605 610GATGACTTAA AATATCGAGG CGTATAATTC GTCGCATTTT ATATAGTTAT CTATGTTTAC 2357ATGACTGTTT AAGCTTGATC TATATTTCTC AAGTGAATTG CCACATATGT TGGTACGGTA 2417ATAAATTAAT GAGGGAGTTT TGAAATTCGC AACCAATCTT ATATACGTTT GATGATATAA 2477ACGGATTGAG ATTCATTAAG CTACCTGATT TTCGCTGAAC TGTTTGTTAT AGGTTTTTAC 2537AGTAAGATAG TTCCTAAGTT TGTTTATTGT CCCCAGTCGG CCAATTGTTC CGGACTTATT 2597ATTATTACCA TTAGTGGTGT TAGTAGTATT 2627(2)SEQ ID NO4的信息(i)序列特征(A)長度609個氨基酸(B)類型氨基酸(D)拓?fù)浣Y(jié)構(gòu)線性(ii)分子類型蛋白質(zhì)(xi)序列描述SEQ ID NO4Met Leu Lys Gly Val Pro Gly Leu Leu Phe Lys Glu Thr Gln Arg His1 5 10 15Leu Lys Pro Arg Leu Val Arg Ile Met 
Glu Asn Thr Ser Gln Asp Glu20 25 30Ser Arg Lys Arg Gln Val Ala Ser Asn Leu Ser Ser Asp Ala Asp Glu35 40 45Gly Ser Pro Ala Val Thr Arg Pro Val Lys Ile Thr Lys Arg Leu Arg50 55 60Lys Lys Asn Leu Gly Thr Gly Glu Leu Arg Asp Lys Ala Gly Phe Lys65 70 75 80Leu Lys Val Gln Asp Val Ser Lys Asn Arg His Arg Gln Val Asp Pro85 90 95Glu Tyr Glu Val Val Val Asp Gly Pro Met Arg Lys Ile Lys Pro Tyr100 105 110Phe Phe Thr Tyr Lys Thr Phe Cys Lys Glu Arg Trp Arg Asp Arg Lys115 120 125Leu Leu AsP Val Phe Val Asp Glu Phe Arg Asp Arg Asp Arg Pro Tyr130 135 140Tyr Glu Lys Val Ile Gly Ser Gly Gly Val Leu Leu Asn Gly Lys Ser145 150 155 160Ser Thr Leu Asp Ser Val Leu Arg Asn Gly Asp Leu Ile Ser His Glu165 170 175Leu His Arg His Glu Pro Pro Val Ser Ser Arg Pro Ile Arg Thr Val180 185 190Tyr Glu Asp Asp Asp Ile Leu Val Ile Asp Lys Pro Ser Gly Ile Pro195 200 205Ala His Pro Thr Gly Arg Tyr Arg Phe Asn Ser Ile Thr Lys Ile Leu210 215 220Glu Lys Gln Leu Gly Tyr Thr Val His Pro Cys Asn Arg Leu Asp Arg225 230 235 240Leu Thr Ser Gly Leu Met Phe Leu Ala Lys Thr Pro Lys Gly Ala Asp245 250 255Glu Met Gly Asp Gln Met Lys Ala Arg Glu Val Lys Lys Glu Tyr Val260 265 270Ala Arg Val Val Gly Glu Phe Pro Ile Gly Glu Ile Val Val Asp Met275 280 285Pro Leu Lys Thr Ile Glu Pro Lys Leu Ala Leu Asn Met Val Cys Asp290 295 300Pro Glu Asp Glu Ala Gly Lys Gly Ala Lys Thr Gln Phe Lys Arg Ile305 310 315 320Ser Tyr Asp Gly Gln Thr Ser Ile Val Lys Cys Gln Pro Tyr Thr Gly325 330 335Arg Thr His Gln Ile Arg Val His Leu Gln Tyr Leu Gly Phe Pro Ile340 345 350Ala Ash Asp Pro Ile Tyr Ser Asn Pro His Ile Trp Gly Pro Ser Leu355 360 365Gly Lys Glu Cys Lys Ala Asp Tyr Lys Glu Val Ile Gln Lys Leu Asn370 375 380Glu Ile Gly Lys Thr Lys Ser Ala Glu Ser Trp Tyr His Ser Asp Ser385 390 395 400Gln Gly Glu Val Phe Lys Gly Glu Gln Cys Asp Glu Cys Gly Thr Glu405 410 415Leu Tyr Thr Asp Pro Gly Pro Asn Asp Leu Asp Leu Trp Leu His Ala420 425 430Tyr Arg Tyr Glu Ser Thr Glu Leu Asp Glu Asn Gly Ala Lys Lys Arg435 440 445Ser Tyr Ser Thr Ala Phe Pro Glu 
Trp Ala Leu Glu Gln His Gly Asp450 455 460Phe Met Arg Leu Ala Ile Glu Gln Ala Lys Lys Cys Pro Pro Ala Lys465 470 475 480Thr Ser Phe Ser Val Gly Ala Val Leu Val Asn Gly Thr Glu Ile Leu485 490 495Ala Thr Gly Tyr Ser Arg Glu Leu Glu Gly Asn Thr His Ala Glu Gln500 505 510Cys Ala Leu Gln Lys Tyr Phe Glu Gln His Lys Thr Asp Lys Val Pro515 520 525Ile Gly Thr Val Ile Tyr Thr Thr Met Glu Pro Cys Ser Leu Arg Leu530 535 540Ser Gly Asn Lys Pro Cys Val Glu Arg Ile Ile Cys Gln Gln Gly Asn545 550 555 560Ile Thr Ala Val Phe Val Gly Val Leu Glu Pro Asp Asn Phe Val Lys565 570 575Asn Asn Thr Ser Arg Ala Leu Leu Glu Gln His Gly Ile Asp Tyr Ile580 585 590Leu Val Pro Gly Phe Gln Glu Glu Cys Thr Glu Ala Ala Leu Lys Gly595 600 605His(2)SEQ ID NO5的信息(i)序列特征(A)長度1082堿基對(B)類型核酸(C)鏈型雙鏈(D)拓?fù)浣Y(jié)構(gòu)線性(ii)分子類型cDNA到mRNA(iii)假擬否(iii)反義否(vi)原始來源(A)生物體棉阿舒囊霉(ix)特征(A)名稱/關(guān)鍵詞5’UTR(B)位置1..314(ix)特征(A)名稱/關(guān)鍵詞CDS(B)位置315..953(ix)特征(A)名稱/關(guān)鍵詞3’UTR(B)位置954..1082(xi)序列描述SEQ ID NO5CCCTTCTTGC ACGGTCGTTT CTGAAACTCT ACGATTATTG GAACAATGAG TAAGTCCTCA 60AATGTACCAC CTATCTGTAG TTTACTATCG GATTTACTGG CTAAGAGCTG ACCTGTTAGG 120CAAGTGAAAC ATATCACATC GCCAGCAGGT TGGGCTACCA AGGATAGTTG ATGACTTCCA 180TCACCTATAA AAGCGGCTTG AGTGCTTTTG CAATGATTCT GTTCACATGA TGGACAAGAA 240ATACGTACAA AAATTTCAAC GTTTTACAAG TTCCCAAGCT TAGTCAACTC ATCACCAACG 300ACAAACCAAG CAAC ATG ACA AGC CCA TGC ACT GAT ATC GGT ACC GCT ATA350Met Thr Set Pro Cys Thr Asp Ile Gly Thr Ala Ile1 5 10GAG CAG TTC AAG CAA AAT AAG ATG ATC ATC GTC ATG GAC CAC ATC TCG398Glu Gln Phe Lys Gln Asn Lys Met Ile Ile Val Met Asp His Ile Ser15 20 25AGA GAA AAC GAG GCC GAT CTA ATA TGT GCA GCA GCG CAC ATG ACT GCC446Arg Glu Asn Glu Ala Asp Leu Ile Cys Ala Ala Ala His Met Thr Ala30 35 40GAG CAA ATG GCA TTT ATG ATT CGG TAT TCC TCG GGC TAC GTT TGC GCT494Glu Gln Met Ala Phe Met Ile Arg Tyr Ser Ser Gly Tyr Val Cys Ala45 50 55 60CCA ATG ACC AAT GCG ATT GCC GAT AAG CTA GAC CTA CCG CTC ATG AAC542Pro Met Thr Asn Ala Ile Ala Asp Lys Leu Asp Leu Pro Leu Met Asn65 70 75ACA TTG 
AAA TGC AAG GCT TTC TCC GAT GAC AGA CAC AGC ACT GCG TAT590Thr Leu Lys Cys Lys Ala Phe Ser Asp Asp Arg His Ser Thr Ala Tyr80 85 90ACA ATC ACC TGT GAC TAT GCG CAC GGG ACG ACG ACA GGT ATC TCC GCA638Thr Ile Thr Cys Asp Tyr Ala His Gly Thr Thr Thr Gly Ile Ser Ala95 100 105CGT GAC CGG GCG TTG ACC GTG AAT CAG TTG GCG AAC CCG GAG TCC AAG686Arg Asp Arg Ala Leu Thr Val Asn Gln Leu Ala Asn Pro Glu Ser Lys110 115 120GCT ACC GAC TTC ACG AAG CCA GGC CAC ATT GTG CCA TTG CGT GCC CGT734Ala Thr Asp Phe Thr Lys Pro Gly His Ile Val Pro Leu Arg Ala Arg125 130 135 140GAC GGC GGC GTG CTC GAG CGT GAC GGG CAC ACC GAA GCG GCG CTC GAC782Asp Gly Gly Val Leu Glu Arg Asp Gly His Thr Glu Ala Ala Leu Asp145 150 155TTG TGC AGA CTA GCG GGT GTG CCA GAG GTC GCT GCT ATT TGT GAA TTA830Leu Cys Arg Leu Ala Gly Val Pro Glu Val Ala Ala Ile Cys Glu Leu160 165 170GTA AGC GAA AGG GAC GTC GGG CTG ATG ATG ACT TTG GAT GAG TGT ATA878Val Ser Glu Arg Asp Val Gly Leu Met Met Thr Leu Asp Glu Cys Ile175 180 185GAA TTC AGC AAG AAG CAC GGT CTT GCC CTC ATC ACC GTG CAT GAC CTG926Glu Phe Ser Lys Lys His Gly Leu Ala Leu Ile Thr Val His Asp Leu190 195 200AAG GCT GCA GTT GCC GCC AAG CAG TAGACGGCAA CGAGTTCTTT AAGTCGGTGT 980Lys Ala Ala Val Ala Ala Lys Gln205 210TCATTTATGT AATATACCAT TTCATCGAAA AAGTCAAATG GTATGAACTA GATTTATCAA 1040TAGTATCTAA GAGTTATGGT ATTCGCAAAA GCTTATCGAT AC1082(2)SEQ ID NO6的信息(i)序列特征(A)長度212個氨基酸(B)類型氨基酸(D)拓?fù)浣Y(jié)構(gòu)線性(ii)分子類型蛋白質(zhì)(xi)序列描述SEQ ID NO6Met Thr Ser Pro Cys Thr Asp Ile Gly Thr Ala Ile Glu Gln Phe Lys1 5 10 15Gln Asn Lys Met Ile Ile Val Met Asp His Ile Ser Arg Glu Asn Glu20 25 30Ala Asp Leu Ile Cys Ala Ala Ala His Met Thr Ala Glu Gln Met Ala35 40 45Phe Met Ile Arg Tyr Ser Ser Gly Tyr Val Cys Ala Pro Met Thr Asn50 55 60Ala Ile Ala Asp Lys Leu Asp Leu Pro Leu Met Asn Thr Leu Lys Cys65 70 75 80Lys Ala Phe Ser Asp Asp Arg His Ser Thr Ala Tyr Thr Ile Thr Cys85 90 95Asp Tyr Ala His Gly Thr Thr Thr Gly Ile Ser Ala Arg Asp Arg Ala100 105 110Leu Thr Val Asn Gln Leu Ala Asn Pro Glu Ser Lys Ala Thr 
Asp Phe115 120 125Thr Lys Pro Gly His Ile Val Pro Leu Arg Ala Arg Asp Gly Gly Val130 135 140Leu Glu Arg Asp Gly His Thr Glu Ala Ala Leu Asp Leu Cys Arg Leu145 150 155 160Ala Gly Val Pro Glu Val Ala Ala Ile Cys Glu Leu Val Ser Glu Arg165 170 175Asp Val Gly Leu Met Met Thr Leu Asp Glu Cys Ile Glu Phe Ser Lys180 185 190Lys His Gly Leu Ala Leu Ile Thr Val His Asp Leu Lys Ala Ala Val195 200 205Ala Ala Lys Gln210(2)SEQ ID NO7的信息(i)序列特征(A)長度996堿基對(B)類型核酸(C)鏈型雙鏈(D)拓?fù)浣Y(jié)構(gòu)線性(ii)分子類型cDNA到mRNA(iii)假擬否(iii)反義否(vi)原始來源(A)生物體棉阿舒囊霉(ix)特征(A)名稱/關(guān)鍵詞5’UTR(B)位置1..270(ix)特征(A)名稱/關(guān)鍵詞CDS(B)位置271..789(ix)特征(A)名稱/關(guān)鍵詞3’UTR(B)位置790..996(xi)序列描述SEQ ID NO7TGGTATAATG ATACAGGAAG TGAAAATCCG AAAGGTTCAG ACGATGAAAA GAGTTTGAGA60CGCATCAATG ATCAGCTTTG AGCTATATGT AAGTCTATTA ATTGATTACT AATAGCAATT 120TATGGTATCC TCTGTTCTGC ATATCGACGG TTCTCACGTG ATGATCAGCT TGAGGCTTCG 180CGGATAAAGT TCCATCGATT ACTATAAAAC CATCACATTA AACGTTCACT ATAGGCATAC 240ACACAGACTA AGTTCAAGTT AGCAGTGACA ATG ATT AAG GGA TTA GGC GAA GTT294Met Ile Lys Gly Leu Gly Glu Val1 5GAT CAA ACC TAC GAT GCG AGC TCT GTC GAG GTT GGC ATT GTC CAC GCG 342Asp Gln Thr Tyr Asp Ala Ser Ser Val Glu Val Gly Ile Val His Ala10 15 20AGA TGG AAC AAG ACT GTC ATT GAC GCT CTC GAC CAA GGT GCA ATT GAG 390Arg Trp Asn Lys Thr Val Ile Asp Ala Leu Asp Gln Gly Ala Ile Glu25 30 35 40AAA CTG CTT GCT ATG GGA GTG AAG GAG AAG AAT ATC ACT GTA AGC ACC 438Lys Leu Leu Ala Met Gly Val Lys Glu Lys Asn Ile Thr Val Ser Thr45 50 55GTT CCA GGT GCG TTT GAA CTA CCA TTT GGC ACT CAG CGG TTT GCC GAG 486Val Pro Gly Ala Phe Glu Leu Pro Phe Gly Thr Gln Arg Phe Ala Glu60 65 70CTG ACC AAG GCA AGT GGC AAG CAT TTG GAC GTG GTC ATC CCA ATT GGA 534Leu Thr Lys Ala Ser Gly Lys His Leu Asp Val Val Ile Pro Ile Gly75 80 85GTC CTG ATC AAA GGC GAC TCA ATG CAC TTT GAA TAT ATA TCA GAC TCT 582Val Leu Ile Lys Gly Asp Ser Met His Phe Glu Tyr Ile Ser Asp Ser90 95 100GTG ACT CAT GCC TTA ATG AAC CTA CAG AAG AAG ATT CGT CTT CCT GTC 630Val Thr His Ala Leu Met Asn Leu Gln Lys Lys Ile Arg Leu Pro 
Val
105 110 115 120
ATT TTT GGT TTG CTA ACG TGT CTA ACA GAG GAA CAA GCG TTG ACA CGT 678
Ile Phe Gly Leu Leu Thr Cys Leu Thr Glu Glu Gln Ala Leu Thr Arg
125 130 135
GCA GGC CTC GGT GAA TCT GAA GGC AAG CAC AAC CAC GGT GAA GAC TGG 726
Ala Gly Leu Gly Glu Ser Glu Gly Lys His Asn His Gly Glu Asp Trp
140 145 150
GGT GCT GCT GCC GTG GAG ATG GCT GTA AAG TTT GGC CCA CGC GCC GAA 774
Gly Ala Ala Ala Val Glu Met Ala Val Lys Phe Gly Pro Arg Ala Glu
155 160 165
CAA ATG AAG AAG TGAATATTAA AAAATCACTA CTTAAAATTA ACGTTTTTAT 826
Gln Met Lys Lys
170
TATGTCTATA TCAAATTCTT ACGTGATAAC TTTTGATTTC GCTTCCTGGA TTGGCGCAAG 886
GCCTCCCTGT GTCGCAGTTT TTGTTCACGG GTCCACACAG CTCTGTTTTC CCAGAACATA 946
TCCTCCCAGC CGGCGAACCG GTTAGACGCT TCTGCTGGCG TTCTTATTTT 996

(2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 172 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
Met Ile Lys Gly Leu Gly Glu Val Asp Gln Thr Tyr Asp Ala Ser Ser
1 5 10 15
Val Glu Val Gly Ile Val His Ala Arg Trp Asn Lys Thr Val Ile Asp
20 25 30
Ala Leu Asp Gln Gly Ala Ile Glu Lys Leu Leu Ala Met Gly Val Lys
35 40 45
Glu Lys Asn Ile Thr Val Ser Thr Val Pro Gly Ala Phe Glu Leu Pro
50 55 60
Phe Gly Thr Gln Arg Phe Ala Glu Leu Thr Lys Ala Ser Gly Lys His
65 70 75 80
Leu Asp Val Val Ile Pro Ile Gly Val Leu Ile Lys Gly Asp Ser Met
85 90 95
His Phe Glu Tyr Ile Ser Asp Ser Val Thr His Ala Leu Met Asn Leu
100 105 110
Gln Lys Lys Ile Arg Leu Pro Val Ile Phe Gly Leu Leu Thr Cys Leu
115 120 125
Thr Glu Glu Gln Ala Leu Thr Arg Ala Gly Leu Gly Glu Ser Glu Gly
130 135 140
Lys His Asn His Gly Glu Asp Trp Gly Ala Ala Ala Val Glu Met Ala
145 150 155 160
Val Lys Phe Gly Pro Arg Ala Glu Gln Met Lys Lys
165 170

(2) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1511 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: no
(iii) ANTISENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Ashbya gossypii
(ix) FEATURE:
(A) NAME/KEY: 5'UTR
(B) LOCATION: 1..524
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 525..1232
(ix) FEATURE:
(A) NAME/KEY: 3'UTR
(B) LOCATION: 1233..1511
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
TGTATTCAAC CTGGAGGATA ACGAAATTTC CATGGCGCGG GCGATACCAA CCCACAGGAG 60
CCAGATATAA GACCAATCCC GGCGGGTGTG CCAGCCGCCA TCAGAGACAG CGGGCCAGCA 120
AGGCATGTGA AGTCAAAAGG CGCCAGCTCC TTATCCGCTC CCGCACAAGC AGGACCGGCA 180
TATCCCGATG AGCGCGCCAG CACCCAGACG CTACACCACC ATTCGAAGTA GACTTTAAAA 240
GAGCGCTTTC CAGCTTCTCA GGCAGTTAGC TCTACGACAA AGGAACCAAG TGATTTTCCC 300
GATAGACGCG ACTTGCTCAA CGATGTTTCT GTGACCAGCG CAAGGAGAGA TAGTCCTAAA 360
GTATAATCAG ATAGTTAGTC GTATCTTCTA GTTTTATTAG TCAGCTACAT GGCGAACCGC 420
CATTTCCTTA TGCATGTCTT ACGAGTTTAA AAAGCTCGCG GTAGCAGAAA AGAAGATGCA 480
TAGATGGCAT ACCGAAGCCT ATATCGCCCA TAGAAGTTGA TAGG ATG TTT ACC GGT 536
Met Phe Thr Gly
1
ATA GTG GAA CAC ATT GGC ACT GTT GCT GAG TAC TTG GAG AAC GAT GCC 584
Ile Val Glu His Ile Gly Thr Val Ala Glu Tyr Leu Glu Asn Asp Ala
5 10 15 20
AGC GAG GCA GGC GGC AAC GGT GTG TCA GTC CTT ATC AAG GAT GCG GCT 632
Ser Glu Ala Gly Gly Asn Gly Val Ser Val Leu Ile Lys Asp Ala Ala
25 30 35
CCG ATA CTG GCG GAT TGC CAC ATC GGT GAC TCG ATT GCA TGC AAT GGT 680
Pro Ile Leu Ala Asp Cys His Ile Gly Asp Ser Ile Ala Cys Asn Gly
40 45 50
ATC TGC CTG ACG GTG ACG GAG TTC ACG GCC GAT AGC TTC AAG GTC GGG 728
Ile Cys Leu Thr Val Thr Glu Phe Thr Ala Asp Ser Phe Lys Val Gly
55 60 65
ATC GCA CCA GAA ACA GTT TAT CGG ACG GAA GTC AGC AGC TGG AAA GCT 776
Ile Ala Pro Glu Thr Val Tyr Arg Thr Glu Val Ser Ser Trp Lys Ala
70 75 80
GGC TCC AAG ATC AAC CTA GAA AGG GCC ATC TCG GAC GAC AGG CGC TAC 824
Gly Ser Lys Ile Asn Leu Glu Arg Ala Ile Ser Asp Asp Arg Arg Tyr
85 90 95 100
GGC GGG CAC TAC GTG CAG GGC CAC GTC GAC TCG GTG GCC TCT ATT GTA 872
Gly Gly His Tyr Val Gln Gly His Val Asp Ser Val Ala Ser Ile Val
105 110 115
TCC AGA GAG CAC GAC GGG AAC TCT ATC AAC TTT AAG TTT AAA CTG CGC 920
Ser Arg Glu His Asp Gly Asn Ser Ile Asn Phe Lys Phe Lys Leu Arg
120 125 130
GAT CAA GAG TAC GAG AAG TAC GTA GTA GAA AAG GGT TTT GTG GCG ATC 968
Asp Gln Glu Tyr Glu Lys Tyr Val Val Glu Lys Gly Phe Val Ala Ile
135 140 145
GAC GGT GTG TCG CTG ACT GTA AGC AAG ATG GAT CCA GAT GGC TGT TTC 1016
Asp Gly Val Ser Leu Thr Val Ser Lys Met Asp Pro Asp Gly Cys Phe
150 155 160
TAC ATC TCG ATG ATT GCA CAC ACG CAG ACC GCT GTA GCC CTT CCA CTG 1064
Tyr Ile Ser Met Ile Ala His Thr Gln Thr Ala Val Ala Leu Pro Leu
165 170 175 180
AAG CCG GAC GGT GCC CTC GTG AAC ATA GAA ACG GAT GTT AAC GGC AAG 1112
Lys Pro Asp Gly Ala Leu Val Asn Ile Glu Thr Asp Val Asn Gly Lys
185 190 195
CTA GTA GAG AAG CAG GTT GCA CAG TAC CTG AAT GCG CAG CTG GAA GGT 1160
Leu Val Glu Lys Gln Val Ala Gln Tyr Leu Asn Ala Gln Leu Glu Gly
200 205 210
GAG AGC TCG CCA TTG CAG CGC GTG CTC GAA AGG ATT ATT GAA TCC AAG 1208
Glu Ser Ser Pro Leu Gln Arg Val Leu Glu Arg Ile Ile Glu Ser Lys
215 220 225
CTT GCT AGC ATC TCA AAT AAG TGATTATATT ATCTTGGGTG CTGTATATCT 1259
Leu Ala Ser Ile Ser Asn Lys
230 235
TATGTATGTC TTACGACTGT GAATCAGAGG GGTGGCAGCT GGAACACCAG CGACACACCT 1319
TCGTCTCCCG CGGTGATCAG CCTTCTGTTT TCCTCAAGTA GTACAAAGTC TAGGACACCC 1379
TGTTGTGGCC AACGCAAACA TGGAGCTGCT GCCCGTTACG CACGTCGAAC TCGTAGACCT 1439
TGCCGTCAAT GCACGAGGCG AACAGGTGGA AACCGGTGGT CTTGTCAAAC CGCCAGCTTC 1499
GTGACCGAGT CC 1511

(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 235 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
Met Phe Thr Gly Ile Val Glu His Ile Gly Thr Val Ala Glu Tyr Leu
1 5 10 15
Glu Asn Asp Ala Ser Glu Ala Gly Gly Asn Gly Val Ser Val Leu Ile
20 25 30
Lys Asp Ala Ala Pro Ile Leu Ala Asp Cys His Ile Gly Asp Ser Ile
35 40 45
Ala Cys Asn Gly Ile Cys Leu Thr Val Thr Glu Phe Thr Ala Asp Ser
50 55 60
Phe Lys Val Gly Ile Ala Pro Glu Thr Val Tyr Arg Thr Glu Val Ser
65 70 75 80
Ser Trp Lys Ala Gly Ser Lys Ile Asn Leu Glu Arg Ala Ile Ser Asp
85 90 95
Asp Arg Arg Tyr Gly Gly His Tyr Val Gln Gly His Val Asp Ser Val
100 105 110
Ala Ser Ile Val Ser Arg Glu His Asp Gly Asn Ser Ile Asn Phe Lys
115 120 125
Phe Lys Leu Arg Asp Gln Glu Tyr Glu Lys Tyr Val Val Glu Lys Gly
130 135 140
Phe Val Ala Ile Asp Gly Val Ser Leu Thr Val Ser Lys Met Asp Pro
145 150 155 160
Asp Gly Cys Phe Tyr Ile Ser Met Ile Ala His Thr Gln Thr Ala Val
165 170 175
Ala Leu Pro Leu Lys Pro Asp Gly Ala Leu Val Asn Ile Glu Thr Asp
180 185 190
Val Asn Gly Lys Leu Val Glu Lys Gln Val Ala Gln Tyr Leu Asn Ala
195 200 205
Gln Leu Glu Gly Glu Ser Ser Pro Leu Gln Arg Val Leu Glu Arg Ile
210 215 220
Ile Glu Ser Lys Leu Ala Ser Ile Ser Asn Lys
225 230 235

(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1596 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: no
(iii) ANTISENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Ashbya gossypii
(ix) FEATURE:
(A) NAME/KEY: 5'UTR
(B) LOCATION: 1..352
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 353..1093
(ix) FEATURE:
(A) NAME/KEY: 3'UTR
(B) LOCATION: 1094..1596
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
AGAAGAAGCG CAGGCGCCAG TCCGAGCTGG AGGAGAACGA GGCGGCGCGG TTGACGAACA 60
GCGCGCTGCC CATGGACGAT GCGGGTATAC AGACGGCGGG TATACAGACG GCGGGTGGTG 120
CCGAGAGAGG CACCAGGCCG GCTTCCTCCA GCGATGCAAG GAAGAGAAGG GGACCAGAGG 180
CGAAGTTCAA GCCATCTAAG GTACAGAAGC CCCAATTGAA GCGAACTGCA TCGTCCCGGG 240
CGGATGAGAA CGAGTTCTCG ATATTATAGA GGCCCCCGTT TCGAGTGATT GGCGTCAAAA 300
ACGGCTATCT GCCTTCGTCC GCCCCCACCA CCCTCGGGAA CACTGGCAAA CC ATG 355
Met
1
GCG CTA ATA CCA CTT TCT CAA GAT CTG GCT GAT ATA CTA GCA CCG TAC 403
Ala Leu Ile Pro Leu Ser Gln Asp Leu Ala Asp Ile Leu Ala Pro Tyr
5 10 15
TTA CCG ACA CCA CCG GAC TCA TCC GCA CGC CTG CCG TTT GTC ACG CTG 451
Leu Pro Thr Pro Pro Asp Ser Ser Ala Arg Leu Pro Phe Val Thr Leu
20 25 30
ACG TAT GCG CAG TCC CTA GAT GCT CGT ATC GCG AAG CAA AAG GGT GAA 499
Thr Tyr Ala Gln Ser Leu Asp Ala Arg Ile Ala Lys Gln Lys Gly Glu
35 40 45
AGG ACG GTT ATT TCG CAT GAG GAG ACC AAG ACA ATG ACG CAT TAT CTA 547
Arg Thr Val Ile Ser His Glu Glu Thr Lys Thr Met Thr His Tyr Leu
50 55 60 65
CGC TAC CAT CAT AGC GGC ATC CTG ATT GGC TCG GGC ACA GCC CTT GCG 595
Arg Tyr His His Ser Gly Ile Leu Ile Gly Ser Gly Thr Ala Leu Ala
70 75 80
GAC GAC CCG GAT CTC AAT TGC CGG TGG ACA CCT GCA GCG GAC GGG GCG 643
Asp Asp Pro Asp Leu Asn Cys Arg Trp Thr Pro Ala Ala Asp Gly Ala
85 90 95
GAT TGC ACC GAA CAG TCT TCA CCA CGA CCC ATT ATC TTG GAT GTT CGG 691
Asp Cys Thr Glu Gln Ser Ser Pro Arg Pro Ile Ile Leu Asp Val Arg
100 105 110
GGC AGA TGG AGA TAC CGC GGG TCC AAA ATA GAG TAT CTG CAT AAC CTT 739
Gly Arg Trp Arg Tyr Arg Gly Ser Lys Ile Glu Tyr Leu His Asn Leu
115 120 125
GGC AAG GGG AAG GCG CCC ATA GTG GTC ACG GGG GGT GAG CCG GAG GTC 787
Gly Lys Gly Lys Ala Pro Ile Val Val Thr Gly Gly Glu Pro Glu Val
130 135 140 145
CGC GAA CTA GGC GTC AGT TAC CTG CAG CTG GGT GTC GAC GAG GGT GGC 835
Arg Glu Leu Gly Val Ser Tyr Leu Gln Leu Gly Val Asp Glu Gly Gly
150 155 160
CGC TTG AAT TGG GGC GAG TTG TTT GAG CGA CTC TAT TCT GAG CAC CAC 883
Arg Leu Asn Trp Gly Glu Leu Phe Glu Arg Leu Tyr Ser Glu His His
165 170 175
CTG GAA AGT GTC ATG GTC GAA GGC GGC GCG GAG GTG CTC AAC CAG CTG 931
Leu Glu Ser Val Met Val Glu Gly Gly Ala Glu Val Leu Asn Gln Leu
180 185 190
CTG CTG CGC CCA GAT ATT GTG GAC AGT CTG GTG ATC ACG ATA GGA TCC 979
Leu Leu Arg Pro Asp Ile Val Asp Ser Leu Val Ile Thr Ile Gly Ser
195 200 205
AAG TTC CTG GGC TCA CTA GGT GTT GCG GTC TCA CCA GCT GAG GAG GTG 1027
Lys Phe Leu Gly Ser Leu Gly Val Ala Val Ser Pro Ala Glu Glu Val
210 215 220 225
AAC CTA GAG CAT GTG AAC TGG TGG CAC GGA ACA AGT GAC AGT GTT TTG 1075
Asn Leu Glu His Val Asn Trp Trp His Gly Thr Ser Asp Ser Val Leu
230 235 240
TGC GGC CGG CTC GCA TAGCGGTTAT GACTGGTCTA CTAGTTAAAA CTATTTACTC 1130
Cys Gly Arg Leu Ala
245
CTATACATAT TGCGTCACAT AGCGTTTATC CCCCTCGCCA ACCGCCTCGT GCCGTTGGAA 1190
ACACGGCGGC CGGGGGACCT CAAGCGCTCC GCATCGACTA GTTTAATTTA CAAACAGATT 1250
CTGTAACTTG CGTAACGGCC AGAGGTCTCT GACTTTCTGA TAATCTTCAC CACCTCACCT 1310
CGCTTCAACC CCAGGTATAA TGCAACTTGG ATCCATCCTC TGGATTCTAG GTAACTGAGA 1370
TTCCTTTAAC CTGTATCTCT TCAACAACTC CTTCTTTTCT TCGTCGCTGA GTTTGATATG 1430
TTTTGGCACA AGCTCATGGT GCGTGATATT TACCACCAAA GCTGTTTCGT TGAAAGTCTC 1490
AATTGTAGCA GGAGCGACGG AGGGAAGCAG TTTCAACGCG CTGGGCGTTA TGCCGTTCTG 1550
ATATATGAAA ATACCCGTCT GGAAGTTCTT CTCGCCAATG TGGATC 1596

(2) INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 246 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
Met Ala Leu Ile Pro Leu Ser Gln Asp Leu Ala Asp Ile Leu Ala Pro
1 5 10 15
Tyr Leu Pro Thr Pro Pro Asp Ser Ser Ala Arg Leu Pro Phe Val Thr
20 25 30
Leu Thr Tyr Ala Gln Ser Leu Asp Ala Arg Ile Ala Lys Gln Lys Gly
35 40 45
Glu Arg Thr Val Ile Ser His Glu Glu Thr Lys Thr Met Thr His Tyr
50 55 60
Leu Arg Tyr His His Ser Gly Ile Leu Ile Gly Ser Gly Thr Ala Leu
65 70 75 80
Ala Asp Asp Pro Asp Leu Asn Cys Arg Trp Thr Pro Ala Ala Asp Gly
85 90 95
Ala Asp Cys Thr Glu Gln Ser Ser Pro Arg Pro Ile Ile Leu Asp Val
100 105 110
Arg Gly Arg Trp Arg Tyr Arg Gly Ser Lys Ile Glu Tyr Leu His Asn
115 120 125
Leu Gly Lys Gly Lys Ala Pro Ile Val Val Thr Gly Gly Glu Pro Glu
130 135 140
Val Arg Glu Leu Gly Val Ser Tyr Leu Gln Leu Gly Val Asp Glu Gly
145 150 155 160
Gly Arg Leu Asn Trp Gly Glu Leu Phe Glu Arg Leu Tyr Ser Glu His
165 170 175
His Leu Glu Ser Val Met Val Glu Gly Gly Ala Glu Val Leu Asn Gln
180 185 190
Leu Leu Leu Arg Pro Asp Ile Val Asp Ser Leu Val Ile Thr Ile Gly
195 200 205
Ser Lys Phe Leu Gly Ser Leu Gly Val Ala Val Ser Pro Ala Glu Glu
210 215 220
Val Asn Leu Glu His Val Asn Trp Trp His Gly Thr Ser Asp Ser Val
225 230 235 240
Leu Cys Gly Arg Leu Ala
245

Claims
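1. A DNA sequence encoding a polypeptide having the amino acid sequence shown in SEQ ID NO: 2, or encoding an analog or derivative of the polypeptide shown in SEQ ID NO: 2 in which one or more amino acids have been deleted, inserted, or replaced by other amino acids without substantially reducing the enzymatic activity of the polypeptide.
2. A DNA sequence encoding a polypeptide having the amino acid sequence shown in SEQ ID NO: 4, or encoding an analog or derivative of the polypeptide shown in SEQ ID NO: 4 in which one or more amino acids have been deleted, inserted, or replaced by other amino acids without substantially reducing the enzymatic activity of the polypeptide.
3. A DNA sequence encoding a polypeptide having the amino acid sequence shown in SEQ ID NO: 6, or encoding an analog or derivative of the polypeptide shown in SEQ ID NO: 6 in which one or more amino acids have been deleted, inserted, or replaced by other amino acids without substantially reducing the enzymatic activity of the polypeptide.
4. A DNA sequence encoding a polypeptide having the amino acid sequence shown in SEQ ID NO: 8, or encoding an analog or derivative of the polypeptide shown in SEQ ID NO: 8 in which one or more amino acids have been deleted, inserted, or replaced by other amino acids without substantially reducing the enzymatic activity of the polypeptide.
5. A DNA sequence encoding a polypeptide having the amino acid sequence shown in SEQ ID NO: 10, or encoding an analog or derivative of the polypeptide shown in SEQ ID NO: 10 in which one or more amino acids have been deleted, inserted, or replaced by other amino acids without substantially reducing the enzymatic activity of the polypeptide.
6. A DNA sequence encoding a polypeptide having the amino acid sequence shown in SEQ ID NO: 12, or encoding an analog or derivative of the polypeptide shown in SEQ ID NO: 12 in which one or more amino acids have been deleted, inserted, or replaced by other amino acids without substantially reducing the enzymatic activity of the polypeptide.
7. An expression vector containing one or more of the DNA sequences of claims 1 to 6.
8. A host organism that has been transformed with the expression system of claim 7.
9. A recombinant method for preparing riboflavin, which uses the host organism of claim 8.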
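Abstract
The invention relates to the riboflavin biosynthesis genes of the fungus Ashbya gossypii and to genetic engineering methods for preparing riboflavin using these genes and their gene products.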
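Document No.: C12N15/00 GK1146781 SQ95192767
Publication date: April 2, 1997. Filing date: March 15, 1995. Priority date: March 25, 1994.
Inventors: J. L. Revuelta Doval, M. J. Buitrago Serna, M. A. Santos García. Applicant: BASF Aktiengesellschaft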