
Importing torch.utils.data as the data_utils module

Time: 2023-04-03 22:02:28 | Asked by a user from Chengdu, Sichuan | Category: Writing | Views: 401
from torch.utils.data.sampler import SubsetRandomSampler
import torchvision
import torchvision.transforms as transforms
from PIL import Image
# from torchsummary import summary
§ Markdown
## Load Data
§ Code
# Define a transform to normalize the data: the per-channel mean and std of the CIFAR10 dataset are used
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))])
# Download and load the training data: CIFAR10 consists of 10 classes of 32x32-pixel images
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
# Download and load the test data from the same dataset
testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
# Divide the training set into training and validation subsets at a ratio of 80:20
num_train = len(trainset)          # total number of samples in the training set: 50000
indices = list(range(num_train))   # one index per training image: [0, ..., 49999]
batchsize = 64                     # number of samples per batch
rand_seed = 1                      # seed for the pseudorandom number generator
§ Markdown
The CIFAR10 training set contains 50000 images and the test set contains 10000. The validation set is carved out of the 50000 training images, not added to them: an 80:20 split gives 40000 images for training and 10000 for validation. The separate test set is left untouched.

The split is done on indices. range(50000) generates the numbers from 0 up to 49999, so indices has the same length as trainset and each entry identifies one image. The list is shuffled and cut into two disjoint parts, one for the training subset and one for the validation subset. torch.utils.data.random_split() is an alternative that splits the dataset object itself; with SubsetRandomSampler the index list is shuffled and sliced instead.

batchsize sets the number of samples per batch. With 40000 training samples and a batch size of 64, one epoch has 625 batches. The 10000 validation samples give 157 batches, the last of which holds only 16 samples.

rand_seed initializes the internal state of the pseudorandom number generator. Without a fixed seed the shuffle, and therefore the split, changes on every run; with the seed set, the same split is produced each time the code runs.

The batches are built with the DataLoader class from torch.utils.data. The arguments used here are:
1. dataset: the dataset object to read from
2. batch_size: the batch size defined above
3. shuffle: whether DataLoader should shuffle the data itself
4. num_workers: the number of worker processes used for loading

shuffle is left at False because a sampler is supplied. The two options are mutually exclusive, and the sampler already draws the indices in random order. num_workers is kept at 0, so the data is loaded in the main process without parallel workers.

SubsetRandomSampler, from torch.utils.data.sampler, takes a sequence of indices (not a start and end index) and draws from exactly those indices in random order, without replacement. The remaining steps are therefore:
1. Define one sampler for the training indices and one for the validation indices.
2. Create one DataLoader for each sampler, passing trainset, the batch size, and the sampler.
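The answer stops before the samplers and loaders are actually created. The remaining steps might look roughly like the sketch below. To keep it runnable without downloading CIFAR10, it uses a small stand-in TensorDataset of 1000 fake images; the names dataset, generator, train_idx, val_idx, train_sampler, val_sampler, train_loader and val_loader are illustrative assumptions, not names from the original answer.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.sampler import SubsetRandomSampler

# Stand-in for trainset (assumption): 1000 blank 3x32x32 "images" with labels 0..9,
# so the sketch runs without downloading CIFAR10.
num_samples = 1000
dataset = TensorDataset(torch.zeros(num_samples, 3, 32, 32),
                        torch.arange(num_samples) % 10)

batchsize = 64
rand_seed = 1

# Seeded generator so the shuffle, and therefore the split, is reproducible.
generator = torch.Generator().manual_seed(rand_seed)

# Shuffle all indices once, then cut them 80:20 into two disjoint lists.
indices = torch.randperm(len(dataset), generator=generator).tolist()
num_val = round(0.2 * len(dataset))
val_idx, train_idx = indices[:num_val], indices[num_val:]

# Each sampler draws only from its own index list, in random order.
train_sampler = SubsetRandomSampler(train_idx, generator=generator)
val_sampler = SubsetRandomSampler(val_idx, generator=generator)

# shuffle is left at its default (False) because a sampler is supplied.
train_loader = DataLoader(dataset, batch_size=batchsize,
                          sampler=train_sampler, num_workers=0)
val_loader = DataLoader(dataset, batch_size=batchsize,
                        sampler=val_sampler, num_workers=0)

print(len(train_idx), len(val_idx))        # sizes of the two index lists
print(len(train_loader), len(val_loader))  # number of batches per epoch
```

With the real data, dataset would be replaced by trainset, and both loaders would read from the same trainset object through different samplers.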
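For the question in the title, a minimal sketch of the data_utils alias follows, together with the random_split() route that the answer mentions by name. It is an alternative to the sampler approach, not what the answer's code does. The stand-in dataset and the names train_subset and val_subset are assumptions made for illustration.

```python
import torch
import torch.utils.data as data_utils  # the alias the question asks about

# Stand-in dataset (assumption): 1000 blank 3x32x32 "images" with labels 0..9.
dataset = data_utils.TensorDataset(torch.zeros(1000, 3, 32, 32),
                                   torch.arange(1000) % 10)

# random_split partitions the dataset object itself into two Subset objects.
generator = torch.Generator().manual_seed(1)
train_subset, val_subset = data_utils.random_split(
    dataset, [800, 200], generator=generator)

# No sampler is passed here, so shuffle=True may be used for training.
train_loader = data_utils.DataLoader(train_subset, batch_size=64,
                                     shuffle=True, num_workers=0)
val_loader = data_utils.DataLoader(val_subset, batch_size=64,
                                   shuffle=False, num_workers=0)

print(len(train_subset), len(val_subset))
```

Everything in torch.utils.data (DataLoader, TensorDataset, random_split and so on) is reachable through the alias, which is all the import does.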
